linux-raid.vger.kernel.org archive mirror
* Can't add disk to failed raid array
@ 2006-07-16  0:56 Paul Waldo
  2006-07-16 10:19 ` Neil Brown
                   ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Paul Waldo @ 2006-07-16  0:56 UTC (permalink / raw)
  To: linux-raid

Hi all,

I have a RAID6 array where a disk went bad.  I removed the old disk, put in an 
identical one, and repartitioned the new disk.  I am now trying to add the 
new partition to the array, but I get this error: 

[root@paul ~]# mdadm --add /dev/md1  /dev/hdd2
mdadm: add new device failed for /dev/hdd2 as 2: Invalid argument

When I perform that command, /var/log/messages says this:
Jul 15 20:48:39 paul kernel: md: hdd2 has invalid sb, not importing!
Jul 15 20:48:39 paul kernel: md: md_import_device returned -22


Below is the relevant data.  What might I be doing wrong?  Thanks in advance!

Paul

[root@paul ~]# cat /proc/mdstat
Personalities : [raid6] [raid1]
md0 : active raid1 hdd1[6](S) hda1[0] hdc1[1] hde1[2](S) hdg1[3](S) sda1[4](S) 
sdb1[5](S)
      979840 blocks [2/2] [UU]

md1 : active raid6 sdb2[6] sda2[5] hdg2[4] hde2[3] hdc2[1] hda2[0]
      776541440 blocks level 6, 256k chunk, algorithm 2 [7/6] [UU_UUUU]

unused devices: <none>
[root@paul ~]# fdisk -l /dev/hdd

Disk /dev/hdd: 160.0 GB, 160029999616 bytes
255 heads, 63 sectors/track, 19455 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/hdd1               1         122      979933+  fd  Linux raid autodetect
/dev/hdd2             123       19455   155292322+  fd  Linux raid autodetect

[root@paul log]# mdadm --detail /dev/md1
/dev/md1:
        Version : 00.90.03
  Creation Time : Fri Jun 23 22:35:27 2006
     Raid Level : raid6
     Array Size : 776541440 (740.57 GiB 795.18 GB)
    Device Size : 155308288 (148.11 GiB 159.04 GB)
   Raid Devices : 7
  Total Devices : 6
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Sat Jul 15 20:53:29 2006
          State : clean, degraded
 Active Devices : 6
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 0

     Chunk Size : 256K

           UUID : 2e316d9e:20cac82a:2555918e:bb9acc07
         Events : 0.1396384

    Number   Major   Minor   RaidDevice State
       0       3        2        0      active sync   /dev/hda2
       1      22        2        1      active sync   /dev/hdc2
   3157553       0        0        5      removed
       3      33        2        3      active sync   /dev/hde2
       4      34        2        4      active sync   /dev/hdg2
       5       8        2        5      active sync   /dev/sda2
       6       8       18        6      active sync   /dev/sdb2

[root@paul log]# uname -rv
2.6.17-1.2139_FC5 #1 Fri Jun 23 12:40:16 EDT 2006

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Can't add disk to failed raid array
  2006-07-16  0:56 Can't add disk to failed raid array Paul Waldo
@ 2006-07-16 10:19 ` Neil Brown
  2006-07-16 12:24   ` Paul Waldo
       [not found] ` <200607160913.32005.pwaldo@waldoware.com>
  2006-07-17 18:36 ` Paul Waldo
  2 siblings, 1 reply; 22+ messages in thread
From: Neil Brown @ 2006-07-16 10:19 UTC (permalink / raw)
  To: Paul Waldo; +Cc: linux-raid

On Saturday July 15, pwaldo@waldoware.com wrote:
> Hi all,
> 
> I have a RAID6 array where a disk went bad.  I removed the old disk, put in an 
> identical one, and repartitioned the new disk.  I am now trying to add the 
> new partition to the array, but I get this error: 
> 
> [root@paul ~]# mdadm --add /dev/md1  /dev/hdd2
> mdadm: add new device failed for /dev/hdd2 as 2: Invalid argument
> 
> When I perform that command, /var/log/messages says this:
> Jul 15 20:48:39 paul kernel: md: hdd2 has invalid sb, not importing!
> Jul 15 20:48:39 paul kernel: md: md_import_device returned -22


Rings a bell, but I cannot quite place it..

What version of mdadm are you running?  If not 2.5.2, try that.
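
For reference, a rough sketch of trying a newer mdadm without touching the
distro package (assuming the usual kernel.org download location and a
working build toolchain):

  mdadm --version                 # what is installed now
  wget http://www.kernel.org/pub/linux/utils/raid/mdadm/mdadm-2.5.2.tgz
  tar xzf mdadm-2.5.2.tgz && cd mdadm-2.5.2
  make                            # build only; nothing is installed
  ./mdadm --version               # run the freshly built binary in place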

...
>     Number   Major   Minor   RaidDevice State
>        0       3        2        0      active sync   /dev/hda2
>        1      22        2        1      active sync   /dev/hdc2
>    3157553       0        0        5      removed
     ^^^^^^^
That looks very odd.  If 2.5.2 does that I'll have to look into why.

NeilBrown


>        3      33        2        3      active sync   /dev/hde2
>        4      34        2        4      active sync   /dev/hdg2
>        5       8        2        5      active sync   /dev/sda2
>        6       8       18        6      active sync   /dev/sdb2
> 
> [root@paul log]# uname -rv
> 2.6.17-1.2139_FC5 #1 Fri Jun 23 12:40:16 EDT 2006
> -
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Can't add disk to failed raid array
  2006-07-16 10:19 ` Neil Brown
@ 2006-07-16 12:24   ` Paul Waldo
  2006-07-18  5:36     ` Neil Brown
  0 siblings, 1 reply; 22+ messages in thread
From: Paul Waldo @ 2006-07-16 12:24 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-raid

Thanks for the reply, Neil.  Here is my version:
[root@paul log]# mdadm --version
mdadm - v2.3.1 - 6 February 2006

This is more or less a production system, running Fedora Core 5.  Official 
packages containing mdadm at version 2.5.2 aren't available (to my 
knowledge), and I am very hesitant to experiment with non-official 
software :-(

Would "mdadm --assemble" be of use to me here?  From the message "md: hdd2 has 
invalid sb, not importing!", it seems I need to get an "sb" (assumed to be 
super block) on that partition.  Would --assemble work and not destroy the 
existing good array?

As another data point, when I replaced the bad disk, I was able to 
add /dev/hdd1 to /dev/md0 with no problem.  md0 is a RAID 1 array and hdd1 
was a spare, so there may be no relation at all...

Thanks for your help!

On Sunday 16 July 2006 6:19 am, Neil Brown wrote:
> On Saturday July 15, pwaldo@waldoware.com wrote:
> > Hi all,
> >
> > I have a RAID6 array where a disk went bad.  I removed the old disk, put
> > in an identical one, and repartitioned the new disk.  I am now trying to
> > add the new partition to the array, but I get this error:
> >
> > [root@paul ~]# mdadm --add /dev/md1  /dev/hdd2
> > mdadm: add new device failed for /dev/hdd2 as 2: Invalid argument
> >
> > When I perform that command, /var/log/messages says this:
> > Jul 15 20:48:39 paul kernel: md: hdd2 has invalid sb, not importing!
> > Jul 15 20:48:39 paul kernel: md: md_import_device returned -22
>
> Rings a bell, but I cannot quite place it..
>
> What version of mdadm are you running?  If not 2.5.2, try that.
>
> ...
>
> >     Number   Major   Minor   RaidDevice State
> >        0       3        2        0      active sync   /dev/hda2
> >        1      22        2        1      active sync   /dev/hdc2
> >    3157553       0        0        5      removed
>
>      ^^^^^^^
> That looks very odd.  If 2.5.2 does that I'll have to look into why.
>
> NeilBrown
>
> >        3      33        2        3      active sync   /dev/hde2
> >        4      34        2        4      active sync   /dev/hdg2
> >        5       8        2        5      active sync   /dev/sda2
> >        6       8       18        6      active sync   /dev/sdb2
> >
> > [root@paul log]# uname -rv
> > 2.6.17-1.2139_FC5 #1 Fri Jun 23 12:40:16 EDT 2006
> > -
> > To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Can't add disk to failed raid array
       [not found]   ` <Pine.LNX.4.62.0607161524170.7520@uplift.swm.pp.se>
@ 2006-07-16 14:39     ` Paul Waldo
  0 siblings, 0 replies; 22+ messages in thread
From: Paul Waldo @ 2006-07-16 14:39 UTC (permalink / raw)
  To: linux-raid; +Cc: Mikael Abrahamsson

Still no joy after zeroing the superblock,  Mikael.  :-(

[root@paul ~]# mdadm --examine /dev/hdd2
/dev/hdd2:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 2e316d9e:20cac82a:2555918e:bb9acc07
  Creation Time : Fri Jun 23 22:35:27 2006
     Raid Level : raid6
    Device Size : 155308288 (148.11 GiB 159.04 GB)
     Array Size : 776541440 (740.57 GiB 795.18 GB)
   Raid Devices : 7
  Total Devices : 6
Preferred Minor : 1

    Update Time : Sun Jul 16 09:10:41 2006
          State : clean
 Active Devices : 6
Working Devices : 6
 Failed Devices : 1
  Spare Devices : 0
       Checksum : 6bdf5797 - correct
         Events : 0.1401144


      Number   Major   Minor   RaidDevice State
this     2      22       66       -1      sync   /dev/hdd2

   0     0       3        2        0      active sync   /dev/hda2
   1     1      22        2        1      active sync   /dev/hdc2
   2     2      22       66       -1      sync   /dev/hdd2
   3     3      33        2        3      active sync   /dev/hde2
   4     4      34        2        4      active sync   /dev/hdg2
   5     5       8        2        5      active sync   /dev/sda2
   6     6       8       18        6      active sync   /dev/sdb2
[root@paul ~]# cat /proc/mdstat
Personalities : [raid6] [raid1]
md0 : active raid1 hda1[0] hdc1[1] hdd1[2](S) hde1[3](S) hdg1[4](S) sda1[5](S) 
sdb1[6](S)
      979840 blocks [2/2] [UU]

md1 : active raid6 sdb2[6] sda2[5] hdg2[4] hde2[3] hdc2[1] hda2[0]
      776541440 blocks level 6, 256k chunk, algorithm 2 [7/6] [UU_UUUU]

unused devices: <none>
[root@paul ~]# mdadm --examine /dev/hdd2|grep -i super
[root@paul ~]# mdadm --misc --zero-superblock /dev/hdd2
[root@paul ~]# cat /proc/mdstat
Personalities : [raid6] [raid1]
md0 : active raid1 hda1[0] hdc1[1] hdd1[2](S) hde1[3](S) hdg1[4](S) sda1[5](S) 
sdb1[6](S)
      979840 blocks [2/2] [UU]

md1 : active raid6 sdb2[6] sda2[5] hdg2[4] hde2[3] hdc2[1] hda2[0]
      776541440 blocks level 6, 256k chunk, algorithm 2 [7/6] [UU_UUUU]

unused devices: <none>
[root@paul ~]# mdadm --add /dev/md1  /dev/hdd2
mdadm: add new device failed for /dev/hdd2 as 2: Invalid argument
[root@paul ~]# tail -5 /var/log/messages
Jul 16 09:10:42 paul kernel: md: md_import_device returned -22
Jul 16 09:10:42 paul kernel: md: hdd2 has invalid sb, not importing!
Jul 16 09:10:42 paul kernel: md: md_import_device returned -22
Jul 16 10:33:33 paul kernel: md: hdd2 has invalid sb, not importing!
Jul 16 10:33:33 paul kernel: md: md_import_device returned -22


On Sunday 16 July 2006 9:26 am, you wrote:
> On Sun, 16 Jul 2006, Paul Waldo wrote:
>
> The superblock is at the end so you probably didn't clear it. Use "mdadm
> --examine /dev/hdd2" and see what it says. If it does say it has a
> superblock, use "--misc --zero-superblock /dev/hdd2"

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Can't add disk to failed raid array
  2006-07-16  0:56 Can't add disk to failed raid array Paul Waldo
  2006-07-16 10:19 ` Neil Brown
       [not found] ` <200607160913.32005.pwaldo@waldoware.com>
@ 2006-07-17 18:36 ` Paul Waldo
  2 siblings, 0 replies; 22+ messages in thread
From: Paul Waldo @ 2006-07-17 18:36 UTC (permalink / raw)
  To: linux-raid

All has been quiet on this topic for a while--any more takers?  Please 
help if you can!  Thanks in advance.  Here is the current state of affairs:

[root@paul ~]# mdadm --add /dev/md1 /dev/hdd2
mdadm: add new device failed for /dev/hdd2 as 2: Invalid argument
[root@paul ~]# mdadm --detail /dev/md1
/dev/md1:
         Version : 00.90.03
   Creation Time : Fri Jun 23 22:35:27 2006
      Raid Level : raid6
      Array Size : 776541440 (740.57 GiB 795.18 GB)
     Device Size : 155308288 (148.11 GiB 159.04 GB)
    Raid Devices : 7
   Total Devices : 6
Preferred Minor : 1
     Persistence : Superblock is persistent

     Update Time : Mon Jul 17 14:02:13 2006
           State : clean, degraded
  Active Devices : 6
Working Devices : 6
  Failed Devices : 0
   Spare Devices : 0

      Chunk Size : 256K

            UUID : 2e316d9e:20cac82a:2555918e:bb9acc07
          Events : 0.1416894

     Number   Major   Minor   RaidDevice State
        0       3        2        0      active sync   /dev/hda2
        1      22        2        1      active sync   /dev/hdc2
        0       0        0    1883272037      removed
        3      33        2        3      active sync   /dev/hde2
        4      34        2        4      active sync   /dev/hdg2
        5       8        2        5      active sync   /dev/sda2
        6       8       18        6      active sync   /dev/sdb2
[root@paul ~]# fdisk -l /dev/hdd

Disk /dev/hdd: 160.0 GB, 160029999616 bytes
255 heads, 63 sectors/track, 19455 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

    Device Boot      Start         End      Blocks   Id  System
/dev/hdd1               1         122      979933+  fd  Linux raid 
autodetect
/dev/hdd2             123       19455   155292322+  fd  Linux raid 
autodetect

[root@paul ~]# tail -5 /var/log/messages
Jul 17 14:02:08 paul kernel: md: md_import_device returned -22
Jul 17 14:35:47 paul kernel: md: hdd2 has invalid sb, not importing!
Jul 17 14:35:47 paul kernel: md: md_import_device returned -22
Jul 17 14:35:47 paul kernel: md: hdd2 has invalid sb, not importing!
Jul 17 14:35:47 paul kernel: md: md_import_device returned -22


Paul Waldo wrote:
> Hi all,
> 
> I have a RAID6 array where a disk went bad.  I removed the old disk, put in an 
> identical one, and repartitioned the new disk.  I am now trying to add the 
> new partition to the array, but I get this error: 
> 
> [root@paul ~]# mdadm --add /dev/md1  /dev/hdd2
> mdadm: add new device failed for /dev/hdd2 as 2: Invalid argument
> 
> When I perform that command, /var/log/messages says this:
> Jul 15 20:48:39 paul kernel: md: hdd2 has invalid sb, not importing!
> Jul 15 20:48:39 paul kernel: md: md_import_device returned -22
> 
> 
> Below is the relevant data.  What might I be doing wrong?  Thanks in advance!
> 
> Paul
> 
> [root@paul ~]# cat /proc/mdstat
> Personalities : [raid6] [raid1]
> md0 : active raid1 hdd1[6](S) hda1[0] hdc1[1] hde1[2](S) hdg1[3](S) sda1[4](S) 
> sdb1[5](S)
>       979840 blocks [2/2] [UU]
> 
> md1 : active raid6 sdb2[6] sda2[5] hdg2[4] hde2[3] hdc2[1] hda2[0]
>       776541440 blocks level 6, 256k chunk, algorithm 2 [7/6] [UU_UUUU]
> 
> unused devices: <none>
> [root@paul ~]# fdisk -l /dev/hdd
> 
> Disk /dev/hdd: 160.0 GB, 160029999616 bytes
> 255 heads, 63 sectors/track, 19455 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
> 
>    Device Boot      Start         End      Blocks   Id  System
> /dev/hdd1               1         122      979933+  fd  Linux raid autodetect
> /dev/hdd2             123       19455   155292322+  fd  Linux raid autodetect
> 
> [root@paul log]# mdadm --detail /dev/md1
> /dev/md1:
>         Version : 00.90.03
>   Creation Time : Fri Jun 23 22:35:27 2006
>      Raid Level : raid6
>      Array Size : 776541440 (740.57 GiB 795.18 GB)
>     Device Size : 155308288 (148.11 GiB 159.04 GB)
>    Raid Devices : 7
>   Total Devices : 6
> Preferred Minor : 1
>     Persistence : Superblock is persistent
> 
>     Update Time : Sat Jul 15 20:53:29 2006
>           State : clean, degraded
>  Active Devices : 6
> Working Devices : 6
>  Failed Devices : 0
>   Spare Devices : 0
> 
>      Chunk Size : 256K
> 
>            UUID : 2e316d9e:20cac82a:2555918e:bb9acc07
>          Events : 0.1396384
> 
>     Number   Major   Minor   RaidDevice State
>        0       3        2        0      active sync   /dev/hda2
>        1      22        2        1      active sync   /dev/hdc2
>    3157553       0        0        5      removed
>        3      33        2        3      active sync   /dev/hde2
>        4      34        2        4      active sync   /dev/hdg2
>        5       8        2        5      active sync   /dev/sda2
>        6       8       18        6      active sync   /dev/sdb2
> 
> [root@paul log]# uname -rv
> 2.6.17-1.2139_FC5 #1 Fri Jun 23 12:40:16 EDT 2006


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Can't add disk to failed raid array
  2006-07-16 12:24   ` Paul Waldo
@ 2006-07-18  5:36     ` Neil Brown
  2006-07-23 11:04       ` In Trouble--Please Help! (was Re: Can't add disk to failed raid array) Paul Waldo
  0 siblings, 1 reply; 22+ messages in thread
From: Neil Brown @ 2006-07-18  5:36 UTC (permalink / raw)
  To: Paul Waldo; +Cc: linux-raid

On Sunday July 16, pwaldo@waldoware.com wrote:
> Thanks for the reply, Neil.  Here is my version:
> [root@paul log]# mdadm --version
> mdadm - v2.3.1 - 6 February 2006

Positively ancient :-)  Nothing obvious in the change log since then.

Can you show me the output of 
  mdadm -E /dev/hdd2
  mdadm -E /dev/hda2

immediately after the failed attempt to add hdd2?

Thanks,
NeilBrown

^ permalink raw reply	[flat|nested] 22+ messages in thread

* In Trouble--Please Help!  (was Re: Can't add disk to failed raid array)
  2006-07-18  5:36     ` Neil Brown
@ 2006-07-23 11:04       ` Paul Waldo
  2006-07-23 11:25         ` Neil Brown
  0 siblings, 1 reply; 22+ messages in thread
From: Paul Waldo @ 2006-07-23 11:04 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-raid

Please, please! I am dead in the water!

To recap, I have a RAID6 array on a Fedora Core 5 system, using
/dev/hd[acdeg]2 and /dev/sd[ab]2.  /dev/hdd went bad so I replaced the drive
and tried to add it back to the array.  Here is what happens:

#mdadm --assemble /dev/md1 /dev/hd[acdeg]2 /dev/sd[ab]2
mdadm: no RAID superblock on /dev/hdd2
mdadm: /dev/hdd2 has no superblock - assembly aborted

I also tried assembling without the new drive:
#mdadm --assemble /dev/md1 /dev/hd[aceg]2 /dev/sd[ab]2
mdadm: failed to RUN_ARRAY /dev/md1: Input/output error

Forgive me if the messages are not exactly correct, as I am booted into the FC 
rescue disk and I am transcribing what I see to another computer.

How can I get mdadm to use the new drive?  Am I completely fscked?  Thanks in 
advance for any help!!!!

Paul


On Tuesday 18 July 2006 01:36, Neil Brown wrote:
> On Sunday July 16, pwaldo@waldoware.com wrote:
> > Thanks for the reply, Neil.  Here is my version:
> > [root@paul log]# mdadm --version
> > mdadm - v2.3.1 - 6 February 2006
>
> Positively ancient :-)  Nothing obvious in the change log since then.
>
> Can you show me the output of
>   mdadm -E /dev/hdd2
>   mdadm -E /dev/hda2
>
> immediately after the failed attempt to add hdd2.
>
> Thanks,
> NeilBrown
> -
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: In Trouble--Please Help!  (was Re: Can't add disk to failed raid array)
  2006-07-23 11:04       ` In Trouble--Please Help! (was Re: Can't add disk to failed raid array) Paul Waldo
@ 2006-07-23 11:25         ` Neil Brown
  2006-07-23 11:50           ` Paul Waldo
  2006-07-23 11:53           ` Paul Waldo
  0 siblings, 2 replies; 22+ messages in thread
From: Neil Brown @ 2006-07-23 11:25 UTC (permalink / raw)
  To: Paul Waldo; +Cc: linux-raid

On Sunday July 23, pwaldo@waldoware.com wrote:
> Please, please! I am dead in the water!
> 
> To recap, I have a RAID6 array on a Fedora Core 5 system, using /dev/hd[acdeg]
> 2 and /dev/sd[ab]2.  /dev/hdd went bad so I replaced the drive and tried to 
> add it back to the array.  Here is what happens:
> 
> #mdadm --assemble /dev/md1 /dev/hd[acdeg]2 /dev/sd[ab]2
> mdadm: no RAID superblock on /dev/hdd2
> mdadm: /dev/hdd2 has no superblock - assembly aborted
> 
> I also tried assembling without the new drive:
> #mdadm --assemble /dev/md1 /dev/hd[aceg]2 /dev/sd[ab]2
> mdadm: failed to RUN_ARRAY /dev/md1: Input/output error
> 
> Forgive me if the messages are not exactly correct, as I am booted into the FC 
> rescue disk and I am transcribing what I see to another computer.
> 
> How can I get mdadm to use the new drive?  Am I completely fscked?  Thanks in 
> advance for any help!!!!

Sorry for not following through on this earlier.
I think I know what the problem is.

 From the "-E" output you have me :
     Device Size : 155308288 (148.11 GiB 159.04 GB)

From the fdisk output in the original email:

[root@paul ~]# fdisk -l /dev/hdd

Disk /dev/hdd: 160.0 GB, 160029999616 bytes
255 heads, 63 sectors/track, 19455 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/hdd1               1         122      979933+  fd  Linux raid autodetect
/dev/hdd2             123       19455   155292322+  fd  Linux raid autodetect


Notice that hdd2 is 155292322 blocks, but the device needs to be at least
155308288 blocks, so it isn't quite big enough.
mdadm should pick this up, but obviously doesn't, and the error message
isn't at all helpful.
I will fix that in the next release.
For now, repartition your drive so that hdd2 is slightly larger.
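
One quick way to compare the exact sizes (just a sketch, assuming sfdisk is
available; blockdev --getsize would do as well):

  sfdisk -s /dev/hdd2                            # new partition, in 1K blocks
  sfdisk -s /dev/hda2                            # a working member, for comparison
  mdadm --detail /dev/md1 | grep 'Device Size'   # what the array needs per device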

NeilBrown

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: In Trouble--Please Help!  (was Re: Can't add disk to failed raid array)
  2006-07-23 11:25         ` Neil Brown
@ 2006-07-23 11:50           ` Paul Waldo
  2006-07-23 11:53           ` Paul Waldo
  1 sibling, 0 replies; 22+ messages in thread
From: Paul Waldo @ 2006-07-23 11:50 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-raid

On Sunday 23 July 2006 07:25, Neil Brown wrote:
> On Sunday July 23, pwaldo@waldoware.com wrote:
> > Please, please! I am dead in the water!
> >
> > To recap, I have a RAID6 array on a Fedora Core 5 system, using
> > /dev/hd[acdeg] 2 and /dev/sd[ab]2.  /dev/hdd went bad so I replaced the
> > drive and tried to add it back to the array.  Here is what happens:
> >
> > #mdadm --assemble /dev/md1 /dev/hd[acdeg]2 /dev/sd[ab]2
> > mdadm: no RAID superblock on /dev/hdd2
> > mdadm: /dev/hdd2 has no superblock - assembly aborted
> >
> > I also tried assembling without the new drive:
> > #mdadm --assemble /dev/md1 /dev/hd[aceg]2 /dev/sd[ab]2
> > mdadm: failed to RUN_ARRAY /dev/md1: Input/output error
> >
> > Forgive me if the messages are not exactly correct, as I am booted into
> > the FC rescue disk and I am transcribing what I see to another computer.
> >
> > How can I get mdadm to use the new drive?  Am I completely fscked? 
> > Thanks in advance for any help!!!!
>
> Sorry for not following through on this earlier.
> I think I know what the problem is.
>
>  From the "-E" output you gave me:
>      Device Size : 155308288 (148.11 GiB 159.04 GB)
>
>  From the fdisk output in the original email:
>
> [root@paul ~]# fdisk -l /dev/hdd
>
> Disk /dev/hdd: 160.0 GB, 160029999616 bytes
> 255 heads, 63 sectors/track, 19455 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
>
>    Device Boot      Start         End      Blocks   Id  System
> /dev/hdd1               1         122      979933+  fd  Linux raid
> autodetect /dev/hdd2             123       19455   155292322+  fd  Linux
> raid autodetect
>
>
> Notice that hdd2 is 155292322, but the device needs to be
> 155308288.  It isn't quite big enough.
> mdadm should pick this up, but obviously doesn't, and the error message
> isn't at all helpful.
> I will fix that in the next release.
> For now, repartition your drive so that hdd2 is slightly larger.
>
> NeilBrown

Hi Neil,

Thanks for the quick reply.  I tried your suggestion.  I now see 
that /dev/hda2 (good partition) is 155308387 blocks.  I 
repartitioned /dev/hdd such that /dev/hdd2 is also 155308387, but I still get 
the message about no superblock on hdd2 :-(.  I used --force.

Paul

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: In Trouble--Please Help!  (was Re: Can't add disk to failed raid array)
  2006-07-23 11:25         ` Neil Brown
  2006-07-23 11:50           ` Paul Waldo
@ 2006-07-23 11:53           ` Paul Waldo
  2006-07-23 11:59             ` Neil Brown
  1 sibling, 1 reply; 22+ messages in thread
From: Paul Waldo @ 2006-07-23 11:53 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-raid

On Sunday 23 July 2006 07:25, Neil Brown wrote:
> On Sunday July 23, pwaldo@waldoware.com wrote:
> > Please, please! I am dead in the water!
> >
> > To recap, I have a RAID6 array on a Fedora Core 5 system, using
> > /dev/hd[acdeg] 2 and /dev/sd[ab]2.  /dev/hdd went bad so I replaced the
> > drive and tried to add it back to the array.  Here is what happens:
> >
> > #mdadm --assemble /dev/md1 /dev/hd[acdeg]2 /dev/sd[ab]2
> > mdadm: no RAID superblock on /dev/hdd2
> > mdadm: /dev/hdd2 has no superblock - assembly aborted
> >
> > I also tried assembling without the new drive:
> > #mdadm --assemble /dev/md1 /dev/hd[aceg]2 /dev/sd[ab]2
> > mdadm: failed to RUN_ARRAY /dev/md1: Input/output error
> >
> > Forgive me if the messages are not exactly correct, as I am booted into
> > the FC rescue disk and I am transcribing what I see to another computer.
> >
> > How can I get mdadm to use the new drive?  Am I completely fscked? 
> > Thanks in advance for any help!!!!
>
> Sorry for not following through on this earlier.
> I think I know what the problem is.
>
>  From the "-E" output you gave me:
>      Device Size : 155308288 (148.11 GiB 159.04 GB)
>
>  From the fdisk output in the original email:
>
> [root@paul ~]# fdisk -l /dev/hdd
>
> Disk /dev/hdd: 160.0 GB, 160029999616 bytes
> 255 heads, 63 sectors/track, 19455 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
>
>    Device Boot      Start         End      Blocks   Id  System
> /dev/hdd1               1         122      979933+  fd  Linux raid
> autodetect /dev/hdd2             123       19455   155292322+  fd  Linux
> raid autodetect
>
>
> Notice that hdd2 is 155292322, but the device needs to be
> 155308288.  It isn't quite big enough.
> mdadm should pick this up, but obviously doesn't, and the error message
> isn't at all helpful.
> I will fix that in the next release.
> For now, repartition your drive so that hdd2 is slightly larger.
>
> NeilBrown


At this point, I'd just be happy to be able to get the degraded array back up 
and running.  Is there any way to do that?  Thanks!

Paul

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: In Trouble--Please Help!  (was Re: Can't add disk to failed raid array)
  2006-07-23 11:53           ` Paul Waldo
@ 2006-07-23 11:59             ` Neil Brown
  2006-07-23 12:32               ` Paul Waldo
  0 siblings, 1 reply; 22+ messages in thread
From: Neil Brown @ 2006-07-23 11:59 UTC (permalink / raw)
  To: Paul Waldo; +Cc: linux-raid

On Sunday July 23, pwaldo@waldoware.com wrote:
> 
> 
> At this point, I'd just be happy to be able to get the degraded array back up 
> and running.  Is there any way to do that?  Thanks!

 mdadm --assemble --force /dev/md1 /dev/hd[aceg]2 /dev/sd[ab]2

should get you the degraded array.  If not, what kernel log messages
do you get?



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: In Trouble--Please Help!  (was Re: Can't add disk to failed raid array)
  2006-07-23 11:59             ` Neil Brown
@ 2006-07-23 12:32               ` Paul Waldo
  2006-07-24 18:26                 ` Dan Williams
  2006-07-24 23:24                 ` Neil Brown
  0 siblings, 2 replies; 22+ messages in thread
From: Paul Waldo @ 2006-07-23 12:32 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 480 bytes --]

Here is the dmesg output.  No log files are created with the FC5 rescue disk.  
Thanks!



On Sunday 23 July 2006 07:59, Neil Brown wrote:
> On Sunday July 23, pwaldo@waldoware.com wrote:
> > At this point, I'd just be happy to be able to get the degraded array
> > back up and running.  Is there any way to do that?  Thanks!
>
>  mdadm --assemble --force /dev/md1 /dev/hd[aceg]2 /dev/sd[ab]2
>
> should get you the degraded array.  If not, what kernel log messages
> do you get?

[-- Attachment #2: dmesg.log --]
[-- Type: text/x-log, Size: 7704 bytes --]

[APCG] enabled at IRQ 20
ACPI: PCI Interrupt 0000:00:02.1[B] -> Link [APCG] -> GSI 20 (level, high) -> IRQ 19
PCI: Setting latency timer of device 0000:00:02.1 to 64
ohci_hcd 0000:00:02.1: OHCI Host Controller
ohci_hcd 0000:00:02.1: new USB bus registered, assigned bus number 3
ohci_hcd 0000:00:02.1: irq 19, io mem 0xe2082000
usb usb3: configuration #1 chosen from 1 choice
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 3 ports detected
usb 1-3: configuration #1 chosen from 1 choice
Initializing USB Mass Storage driver...
scsi0 : SCSI emulation for USB Mass Storage devices
usb-storage: device found at 2
usb-storage: waiting for device to settle before scanning
usbcore: registered new driver usb-storage
USB Mass Storage support registered.
  Vendor: HL-DT-ST  Model: DVDRRW GSA-2166D  Rev: 1.01
  Type:   CD-ROM                             ANSI SCSI revision: 00
sr0: scsi3-mmc drive: 48x/48x writer dvd-ram cd/rw xa/form2 cdda tray
Uniform CD-ROM driver Revision: 3.20
sr 0:0:0:0: Attached scsi CD-ROM sr0
usb-storage: device scan complete
libata version 1.20 loaded.
sata_sil 0000:01:0d.0: version 0.9
ACPI: PCI Interrupt Link [APC3] enabled at IRQ 18
ACPI: PCI Interrupt 0000:01:0d.0[A] -> Link [APC3] -> GSI 18 (level, high) -> IRQ 20
ata1: SATA max UDMA/100 cmd 0xE090C080 ctl 0xE090C08A bmdma 0xE090C000 irq 20
ata2: SATA max UDMA/100 cmd 0xE090C0C0 ctl 0xE090C0CA bmdma 0xE090C008 irq 20
ata1: SATA link up 1.5 Gbps (SStatus 113)
ata1: dev 0 cfg 49:2f00 82:346b 83:7f01 84:4003 85:3c69 86:3c01 87:4003 88:20ff
ata1: dev 0 ATA-7, max UDMA7, 312581808 sectors: LBA48
ata1: dev 0 configured for UDMA/100
scsi1 : sata_sil
ata2: SATA link up 1.5 Gbps (SStatus 113)
ata2: dev 0 cfg 49:2f00 82:346b 83:7f01 84:4003 85:3c69 86:3c01 87:4003 88:20ff
ata2: dev 0 ATA-7, max UDMA7, 312581808 sectors: LBA48
ata2: dev 0 configured for UDMA/100
scsi2 : sata_sil
  Vendor: ATA       Model: SAMSUNG SP1614C   Rev: SW10
  Type:   Direct-Access                      ANSI SCSI revision: 05
SCSI device sda: 312581808 512-byte hdwr sectors (160042 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
SCSI device sda: 312581808 512-byte hdwr sectors (160042 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
 sda: sda1 sda2
sd 1:0:0:0: Attached scsi disk sda
  Vendor: ATA       Model: SAMSUNG SP1614C   Rev: SW10
  Type:   Direct-Access                      ANSI SCSI revision: 05
SCSI device sdb: 312581808 512-byte hdwr sectors (160042 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
SCSI device sdb: 312581808 512-byte hdwr sectors (160042 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
 sdb: sdb1 sdb2
sd 2:0:0:0: Attached scsi disk sdb
forcedeth.c: Reverse Engineered nForce ethernet driver. Version 0.49.
ACPI: PCI Interrupt Link [APCH] enabled at IRQ 22
ACPI: PCI Interrupt 0000:00:04.0[A] -> Link [APCH] -> GSI 22 (level, high) -> IRQ 17
PCI: Setting latency timer of device 0000:00:04.0 to 64
eth0: forcedeth.c: subsystem: 010de:05b2 bound to 0000:00:04.0
ISO 9660 Extensions: Microsoft Joliet Level 3
Unable to load NLS charset utf8
Unable to load NLS charset utf8
ISO 9660 Extensions: RRIP_1991A
Unable to identify CD-ROM format.
VFS: Can't find an ext2 filesystem on dev loop0.
security:  3 users, 6 roles, 1161 types, 135 bools, 1 sens, 256 cats
security:  55 classes, 38679 rules
SELinux:  Completing initialization.
SELinux:  Setting up existing superblocks.
SELinux: initialized (dev loop0, type squashfs), not configured for labeling
SELinux: initialized (dev usbfs, type usbfs), uses genfs_contexts
SELinux: initialized (dev ramfs, type ramfs), uses genfs_contexts
SELinux: initialized (dev ramfs, type ramfs), uses genfs_contexts
SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
SELinux: initialized (dev debugfs, type debugfs), uses genfs_contexts
SELinux: initialized (dev selinuxfs, type selinuxfs), uses genfs_contexts
SELinux: initialized (dev mqueue, type mqueue), uses transition SIDs
SELinux: initialized (dev hugetlbfs, type hugetlbfs), uses genfs_contexts
SELinux: initialized (dev devpts, type devpts), uses transition SIDs
SELinux: initialized (dev eventpollfs, type eventpollfs), uses genfs_contexts
SELinux: initialized (dev inotifyfs, type inotifyfs), uses genfs_contexts
SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
SELinux: initialized (dev futexfs, type futexfs), uses genfs_contexts
SELinux: initialized (dev pipefs, type pipefs), uses task SIDs
SELinux: initialized (dev sockfs, type sockfs), uses task SIDs
SELinux: initialized (dev proc, type proc), uses genfs_contexts
SELinux: initialized (dev bdev, type bdev), uses genfs_contexts
SELinux: initialized (dev rootfs, type rootfs), uses genfs_contexts
SELinux: initialized (dev sysfs, type sysfs), uses genfs_contexts
audit(1153657490.719:2): avc:  denied  { transition } for  pid=532 comm="loader" name="bash" dev=loop0 ino=1485 scontext=system_u:system_r:kernel_t:s0 tcontext=system_u:system_r:anaconda_t:s0 tclass=process
md: raid0 personality registered for level 0
md: raid1 personality registered for level 1
raid5: automatically using best checksumming function: pIII_sse
   pIII_sse  :  4977.000 MB/sec
raid5: using function: pIII_sse (4977.000 MB/sec)
md: raid5 personality registered for level 5
md: raid4 personality registered for level 4
raid6: int32x1    757 MB/s
raid6: int32x2    835 MB/s
raid6: int32x4    779 MB/s
raid6: int32x8    598 MB/s
raid6: mmxx1     1687 MB/s
raid6: mmxx2     2966 MB/s
raid6: sse1x1    1550 MB/s
raid6: sse1x2    2705 MB/s
raid6: using algorithm sse1x2 (2705 MB/s)
md: raid6 personality registered for level 6
JFS: nTxBlock = 4030, nTxLock = 32241
SGI XFS with ACLs, security attributes, large block numbers, no debug enabled
SGI XFS Quota Management subsystem
device-mapper: 4.5.0-ioctl (2005-10-04) initialised: dm-devel@redhat.com
program anaconda is using a deprecated SCSI ioctl, please convert it to SG_IO
program anaconda is using a deprecated SCSI ioctl, please convert it to SG_IO
usb 1-6: new high speed USB device using ehci_hcd and address 3
usb 1-6: configuration #1 chosen from 1 choice
scsi3 : SCSI emulation for USB Mass Storage devices
usb-storage: device found at 3
usb-storage: waiting for device to settle before scanning
  Vendor: SanDisk   Model: Cruzer Mini       Rev: 0.1 
  Type:   Direct-Access                      ANSI SCSI revision: 02
SCSI device sdc: 250879 512-byte hdwr sectors (128 MB)
sdc: Write Protect is off
sdc: Mode Sense: 03 00 00 00
sdc: assuming drive cache: write through
SCSI device sdc: 250879 512-byte hdwr sectors (128 MB)
sdc: Write Protect is off
sdc: Mode Sense: 03 00 00 00
sdc: assuming drive cache: write through
 sdc: sdc1
sd 3:0:0:0: Attached scsi removable disk sdc
usb-storage: device scan complete
md: md1 stopped.
md: bind<hdc2>
md: bind<hde2>
md: bind<hdg2>
md: bind<sda2>
md: bind<sdb2>
md: bind<hda2>
md: md1: raid array is not clean -- starting background reconstruction
raid6: device hda2 operational as raid disk 0
raid6: device sdb2 operational as raid disk 6
raid6: device sda2 operational as raid disk 5
raid6: device hdg2 operational as raid disk 4
raid6: device hde2 operational as raid disk 3
raid6: device hdc2 operational as raid disk 1
raid6: cannot start dirty degraded array for md1
RAID6 conf printout:
 --- rd:7 wd:6 fd:1
 disk 0, o:1, dev:hda2
 disk 1, o:1, dev:hdc2
 disk 3, o:1, dev:hde2
 disk 4, o:1, dev:hdg2
 disk 5, o:1, dev:sda2
 disk 6, o:1, dev:sdb2
raid6: failed to run raid set md1
md: pers->run() failed ...
ext3: No journal on filesystem on fd0

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: In Trouble--Please Help! (was Re: Can't add disk to failed raid array)
  2006-07-23 12:32               ` Paul Waldo
@ 2006-07-24 18:26                 ` Dan Williams
  2006-07-24 18:35                   ` Paul Waldo
  2006-07-25 14:27                   ` Paul Waldo
  2006-07-24 23:24                 ` Neil Brown
  1 sibling, 2 replies; 22+ messages in thread
From: Dan Williams @ 2006-07-24 18:26 UTC (permalink / raw)
  To: Paul Waldo; +Cc: Neil Brown, linux-raid

On 7/23/06, Paul Waldo <pwaldo@waldoware.com> wrote:
> Here is the dmesg output.  No log files are created with the FC5 rescue disk.
> Thanks!
I ran into this as well; I believe at this point you want to set:

md-mod.start_dirty_degraded=1

as part of your boot options.  Understand you may see some filesystem
corruption as noted in the documentation.

See:
http://www.linux-m32r.org/lxr/http/source/Documentation/md.txt?v=2.6.17#L54
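
For illustration, on the installed FC5 system (GRUB legacy) the parameter
would be appended to the kernel line in /boot/grub/grub.conf; the stanza
below is hypothetical apart from the added option.  From the rescue CD, the
same parameter can instead be typed after "linux rescue" at the boot prompt.

  title Fedora Core (2.6.17-1.2139_FC5)
          root (hd0,0)
          kernel /vmlinuz-2.6.17-1.2139_FC5 ro root=<existing root> md-mod.start_dirty_degraded=1
          initrd /initrd-2.6.17-1.2139_FC5.img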

Regards,

Dan

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: In Trouble--Please Help! (was Re: Can't add disk to failed raid array)
  2006-07-24 18:26                 ` Dan Williams
@ 2006-07-24 18:35                   ` Paul Waldo
  2006-07-24 19:06                     ` Dan Williams
  2006-07-25 14:27                   ` Paul Waldo
  1 sibling, 1 reply; 22+ messages in thread
From: Paul Waldo @ 2006-07-24 18:35 UTC (permalink / raw)
  To: Dan Williams; +Cc: Neil Brown, linux-raid

Dan Williams wrote:
> On 7/23/06, Paul Waldo <pwaldo@waldoware.com> wrote:
>> Here is the dmesg output.  No log files are created with the FC5 
>> rescue disk.
>> Thanks!
> I ran into this as well, I believe at this point you want to set:
> 
> md-mod.start_dirty_degraded=1
> 
> as part of your boot options.  Understand you may see some filesystem
> corruption as noted in the documentation.
> 
> See:
> http://www.linux-m32r.org/lxr/http/source/Documentation/md.txt?v=2.6.17#L54
> 
> Regards,
> 
> Dan

I'll certainly give that a try later on, as I need physical access to 
the box.

The corruption part is worrisome...  When you did this, did you 
experience corruption?  I'm running RAID6 with 7 disks; presumably even 
with two disks out of whack, I should be in good shape...???

Paul

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: In Trouble--Please Help! (was Re: Can't add disk to failed raid array)
  2006-07-24 18:35                   ` Paul Waldo
@ 2006-07-24 19:06                     ` Dan Williams
  0 siblings, 0 replies; 22+ messages in thread
From: Dan Williams @ 2006-07-24 19:06 UTC (permalink / raw)
  To: Paul Waldo; +Cc: Neil Brown, linux-raid

> I'll certainly give that a try later on, as I need physical access to
> the box.
>
> The corruption part is worrisome...  When you did this, did you
> experience corruption?  I'm running RAID6 with 7 disks; presumably even
> with two disks out of whack, I should be in good shape...???
>
I was running a 5-disk RAID-5 and did not detect any corruption.  Neil,
correct me if I am wrong, but I believe that since your failure
occurred without power loss, the chances for data corruption in
this case are small.

Dan

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: In Trouble--Please Help!  (was Re: Can't add disk to failed raid array)
  2006-07-23 12:32               ` Paul Waldo
  2006-07-24 18:26                 ` Dan Williams
@ 2006-07-24 23:24                 ` Neil Brown
  2006-08-02 19:15                   ` Converting Ext3 to Ext3 under RAID 1 Dan Graham
  1 sibling, 1 reply; 22+ messages in thread
From: Neil Brown @ 2006-07-24 23:24 UTC (permalink / raw)
  To: Paul Waldo; +Cc: linux-raid

On Sunday July 23, pwaldo@waldoware.com wrote:
> Here is the dmesg output.  No log files are created with the FC5 rescue disk.  
> Thanks!
> 
> 
> 
> On Sunday 23 July 2006 07:59, Neil Brown wrote:
> > On Sunday July 23, pwaldo@waldoware.com wrote:
> > > At this point, I'd just be happy to be able to get the degraded array
> > > back up and running.  Is there any way to do that?  Thanks!
> >
> >  mdadm --assemble --force /dev/md1 /dev/hd[aceg]2 /dev/sd[ab]2
> >
....
> raid6: cannot start dirty degraded array for md1

Just checked, and this is fixed in the latest mdadm.  If you use 2.5.2
this --assemble --force will work.

Alternately booting with
  md-mod.start_dirty_degraded=1

or running
  echo 1 >  /sys/module/md_mod/parameters/start_dirty_degraded 

before the --assemble

should work.

NeilBrown

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: In Trouble--Please Help! (was Re: Can't add disk to failed raid array)
  2006-07-24 18:26                 ` Dan Williams
  2006-07-24 18:35                   ` Paul Waldo
@ 2006-07-25 14:27                   ` Paul Waldo
  1 sibling, 0 replies; 22+ messages in thread
From: Paul Waldo @ 2006-07-25 14:27 UTC (permalink / raw)
  To: Dan Williams; +Cc: Neil Brown, linux-raid

Dan Williams wrote:
> On 7/23/06, Paul Waldo <pwaldo@waldoware.com> wrote:
>> Here is the dmesg output.  No log files are created with the FC5 
>> rescue disk.
>> Thanks!
> I ran into this as well, I believe at this point you want to set:
> 
> md-mod.start_dirty_degraded=1
> 
> as part of your boot options.  Understand you may see some filesystem
> corruption as noted in the documentation.
> 
> See:
> http://www.linux-m32r.org/lxr/http/source/Documentation/md.txt?v=2.6.17#L54
> 
> Regards,
> 
> Dan

Woo Hoo!  I am back in the running!
The md-mod.start_dirty_degraded option enabled me to get the array running,
and I can now boot the machine.

Per Neil's comments, I adjusted the partitioning on the new drive
(/dev/hdd2) to exactly match that of the other partitions in the array.
Success!  The array is now rebuilding with the new disk!!

Thanks for everyone's help on this--I'd be dead without it.  In return,
maybe I can impart a lesson learned.  My big problem was that the
replacement disk, even though it was the same model as the original, did
not have the same geometry.  It was short by two cylinders, which prevented
it from being added to the array.  Next time I create an array of
"identical" disks, I am going to keep a few cylinders on each one unused
for just this type of problem.
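
Another option worth noting for anyone building a fresh array (not something
tried here): mdadm's --size option caps how much of each component device the
array uses, so a replacement that comes up slightly small can still fit.  A
rough sketch with a made-up size (the value is in kilobytes per device):

  mdadm --create /dev/md1 --level=6 --raid-devices=7 --chunk=256 \
        --size=155000000 /dev/hd[acdeg]2 /dev/sd[ab]2

Leaving a little headroom this way accomplishes the same thing as leaving a
few cylinders unused on each disk.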

Again, thanks for all the help!

Paul

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Converting Ext3 to Ext3 under RAID 1
  2006-07-24 23:24                 ` Neil Brown
@ 2006-08-02 19:15                   ` Dan Graham
  2006-08-02 19:28                     ` dean gaudet
                                       ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Dan Graham @ 2006-08-02 19:15 UTC (permalink / raw)
  To: linux-raid

Hello;
   I have an existing, active ext3 filesystem which I would like to convert to
a RAID 1 ext3 filesystem with minimal downtime.  After casting about the web 
and experimenting some on a test system, I believe that I can accomplish this 
in the following manner.

   - Dismount the filesystem.
   - Shrink the filesystem to leave room for the RAID superblock at the end,
     while leaving the partition size untouched (shrinking by 16 blocks seems
     to work).
   - Create a degraded array with only the partition carrying the shrunk ext3
     system.
   - Start the array and mount it.
   - Hot-add the mirroring partitions (see the command sketch below).
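
Expressed as commands, the plan would look roughly like this (purely a
sketch; device names, mount point, and the shrunken size are made up):

  umount /mnt/data                        # dismount the filesystem
  e2fsck -f /dev/hda3                     # resize2fs requires a clean filesystem
  resize2fs /dev/hda3 <blocks_minus_16>   # shrink the fs; the partition stays as-is
  mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/hda3 missing
  mount /dev/md2 /mnt/data                # the degraded array is running; mount it
  mdadm --add /dev/md2 /dev/hdc3          # hot-add the mirroring partition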

The questions I have for those who know Linux-Raid better than I.

    Is this scheme even half-way sane?
    Is 16 blocks a large enough area?


Thanks in advance for any and all feed-back.

-- 
Daniel Graham
graham@molbio.uoregon.edu

541-346-5079 (voice)
541-346-4854 (FAX)

Institute of Molecular Biology
1229 University of Oregon
Eugene, OR 97403-1229


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Converting Ext3 to Ext3 under RAID 1
  2006-08-02 19:15                   ` Converting Ext3 to Ext3 under RAID 1 Dan Graham
@ 2006-08-02 19:28                     ` dean gaudet
  2006-08-02 19:31                     ` Paul Clements
  2006-08-02 22:38                     ` Robert Heinzmann
  2 siblings, 0 replies; 22+ messages in thread
From: dean gaudet @ 2006-08-02 19:28 UTC (permalink / raw)
  To: Dan Graham; +Cc: linux-raid

On Wed, 2 Aug 2006, Dan Graham wrote:

> Hello;
>   I have an existing, active ext3 filesystem which I would like to convert to
> a RAID 1 ext3 filesystem with minimal down time.  After casting about the web
> and experimenting some on a test system, I believe that I can accomplish this
> in the following manner.
> 
>   - Dismount the filesystem.
>   - Shrink the filesystem to leave room for the RAID superblock at the end
>     while leaving the partition size untouched (shrinking by 16 blocks seems
> to
>     work )
>   - Create a degraded array with only the partition carrying the shrunk ext3
>     system
>   - start the array and mount the array.
>   - hot add the mirroring partitions.
> 
> The questions I have for those who know Linux-Raid better than I.
> 
>    Is this scheme even half-way sane?

yes

>    Is 16 blocks a large enough area?

I always err on the side of caution and take a few meg off, then resize it
back up to full size after creating the degraded raid1.  (Hmm, maybe mdadm
has some way to tell you how large the resulting partitions would be...
I've never looked.)

You pretty much have to do all of this from a recovery or live CD...

Don't forget to rebuild your initrds... all of them, including those for
older kernels... otherwise one of them could still mount the filesystem
without using the md device name (and destroy the mirror's integrity).

Don't forget to set the second disk's boot partition active and install grub
so that you can boot from it when the first disk fails... (after you've
mirrored the boot or root partition).
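
A rough sketch of those last two points (assuming Fedora's mkinitrd and GRUB
legacy; disk names are placeholders, and the active/boot flag can be toggled
with fdisk's "a" command):

  mkinitrd -f /boot/initrd-$(uname -r).img $(uname -r)      # repeat for each installed kernel
  echo -e 'root (hd1,0)\nsetup (hd1)\nquit' | grub --batch  # put GRUB on the second disk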

-dean

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Converting Ext3 to Ext3 under RAID 1
  2006-08-02 19:15                   ` Converting Ext3 to Ext3 under RAID 1 Dan Graham
  2006-08-02 19:28                     ` dean gaudet
@ 2006-08-02 19:31                     ` Paul Clements
  2006-08-03  7:50                       ` Michael Tokarev
  2006-08-02 22:38                     ` Robert Heinzmann
  2 siblings, 1 reply; 22+ messages in thread
From: Paul Clements @ 2006-08-02 19:31 UTC (permalink / raw)
  To: Dan Graham; +Cc: linux-raid

Dan Graham wrote:

>    Is this scheme even half-way sane?

Yes. It sounds correct.

>    Is 16 blocks a large enough area?

Maybe. The superblock will be between 64KB and 128KB from the end of the 
partition. This depends on the size of the partition:

SB_LOC = PART_SIZE - 64K - (PART_SIZE & (64K-1))

So, by 16 blocks, I assume you mean 16 filesystem blocks (which are 
generally 4KB for ext3). So as long as your partition ends exactly on a 
64KB boundary, you should be OK.
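
As a worked example of that formula (shell arithmetic; the device name is
hypothetical):

  PART_SIZE=$(( $(sfdisk -s /dev/hda3) * 1024 ))          # partition size in bytes
  SB_LOC=$(( PART_SIZE - 65536 - (PART_SIZE & 65535) ))   # PART_SIZE - 64K - (PART_SIZE & (64K-1))
  echo $SB_LOC                                            # byte offset of the superblock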

Personally, I would err on the safe side and just shorten the filesystem 
by 128KB. It's not like you're going to miss the extra 64KB.

--
Paul

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Converting Ext3 to Ext3 under RAID 1
  2006-08-02 19:15                   ` Converting Ext3 to Ext3 under RAID 1 Dan Graham
  2006-08-02 19:28                     ` dean gaudet
  2006-08-02 19:31                     ` Paul Clements
@ 2006-08-02 22:38                     ` Robert Heinzmann
  2 siblings, 0 replies; 22+ messages in thread
From: Robert Heinzmann @ 2006-08-02 22:38 UTC (permalink / raw)
  To: Dan Graham; +Cc: linux-raid

Hi Dan,

see thread http://www.spinics.net/lists/raid/msg07742.html.

Regards,
Robert

Dan Graham schrieb:
> Hello;
>   I have an existing, active ext3 filesystem which I would like to 
> convert to
> a RAID 1 ext3 filesystem with minimal down time.  After casting about 
> the web and experimenting some on a test system, I believe that I can 
> accomplish this in the following manner.
>
>   - Dismount the filesystem.
>   - Shrink the filesystem to leave room for the RAID superblock at the 
> end
>     while leaving the partition size untouched (shrinking by 16 blocks 
> seems to
>     work )
>   - Create a degraded array with only the partition carrying the 
> shrunk ext3
>     system
>   - start the array and mount the array.
>   - hot add the mirroring partitions.
>
> The questions I have for those who know Linux-Raid better than I.
>
>    Is this scheme even half-way sane?
>    Is 16 blocks a large enough area?
>
>
> Thanks in advance for any and all feed-back.
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Converting Ext3 to Ext3 under RAID 1
  2006-08-02 19:31                     ` Paul Clements
@ 2006-08-03  7:50                       ` Michael Tokarev
  0 siblings, 0 replies; 22+ messages in thread
From: Michael Tokarev @ 2006-08-03  7:50 UTC (permalink / raw)
  To: Paul Clements; +Cc: Dan Graham, linux-raid

Paul Clements wrote:
>>    Is 16 blocks a large enough area?
> 
> Maybe. The superblock will be between 64KB and 128KB from the end of the
> partition. This depends on the size of the partition:
> 
> SB_LOC = PART_SIZE - 64K - (PART_SIZE & (64K-1))
> 
> So, by 16 blocks, I assume you mean 16 filesystem blocks (which are
> generally 4KB for ext3). So as long as your partition ends exactly on a
> 64KB boundary, you should be OK.
> 
> Personally, I would err on the safe side and just shorten the filesystem
> by 128KB. It's not like you're going to miss the extra 64KB.

Or, better yet, shrink it by 1MB or even 10MB, whatever, convert
to RAID, and (this is the point) resize it to the maximum size of the RAID
device (i.e., don't give a "size" argument to resize2fs).  This way
you will be both safe and will use 100% of the available size.

/mjt

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2006-08-03  7:50 UTC | newest]

Thread overview: 22+ messages
-- links below jump to the message on this page --
2006-07-16  0:56 Can't add disk to failed raid array Paul Waldo
2006-07-16 10:19 ` Neil Brown
2006-07-16 12:24   ` Paul Waldo
2006-07-18  5:36     ` Neil Brown
2006-07-23 11:04       ` In Trouble--Please Help! (was Re: Can't add disk to failed raid array) Paul Waldo
2006-07-23 11:25         ` Neil Brown
2006-07-23 11:50           ` Paul Waldo
2006-07-23 11:53           ` Paul Waldo
2006-07-23 11:59             ` Neil Brown
2006-07-23 12:32               ` Paul Waldo
2006-07-24 18:26                 ` Dan Williams
2006-07-24 18:35                   ` Paul Waldo
2006-07-24 19:06                     ` Dan Williams
2006-07-25 14:27                   ` Paul Waldo
2006-07-24 23:24                 ` Neil Brown
2006-08-02 19:15                   ` Converting Ext3 to Ext3 under RAID 1 Dan Graham
2006-08-02 19:28                     ` dean gaudet
2006-08-02 19:31                     ` Paul Clements
2006-08-03  7:50                       ` Michael Tokarev
2006-08-02 22:38                     ` Robert Heinzmann
     [not found] ` <200607160913.32005.pwaldo@waldoware.com>
     [not found]   ` <Pine.LNX.4.62.0607161524170.7520@uplift.swm.pp.se>
2006-07-16 14:39     ` Can't add disk to failed raid array Paul Waldo
2006-07-17 18:36 ` Paul Waldo
