linux-raid.vger.kernel.org archive mirror
* Weird critical node problem
@ 2008-06-10  7:24 Wayne Gemmell
  2008-06-10  8:15 ` NeilBrown
  0 siblings, 1 reply; 6+ messages in thread
From: Wayne Gemmell @ 2008-06-10  7:24 UTC (permalink / raw)
  To: linux-raid

Hi all

I run a 4-disk array with 6 RAID 1 partitions. This is a very simple 
configuration which has served me well through multiple drive failures. I 
now have a weird problem. One of the disks has become a critical node: my 
server does not boot without this disk, no matter which disk is my primary 
boot disk (BIOS). I've changed the order of the disks and it makes no 
difference. 

When this disk is not in, booting from some of the other disks gives me a 
warning that there is an invalid superblock magic on sda. They all end up 
with an error on stdin and modprobe failing.


Any ideas how to resolve this? I'm leaning towards nuking sd[acd] and 
re-adding them.


-- 
Regards
Wayne

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Weird critical node problem
  2008-06-10  7:24 Weird critical node problem Wayne Gemmell
@ 2008-06-10  8:15 ` NeilBrown
  2008-06-10  8:30   ` Wayne Gemmell
  0 siblings, 1 reply; 6+ messages in thread
From: NeilBrown @ 2008-06-10  8:15 UTC (permalink / raw)
  To: wayne; +Cc: linux-raid

On Tue, June 10, 2008 5:24 pm, Wayne Gemmell wrote:
> Hi all
>
> I run a 4-disk array with 6 RAID 1 partitions. This is a very simple
> configuration which has served me well through multiple drive failures. I
> now have a weird problem. One of the disks has become a critical node: my
> server does not boot without this disk, no matter which disk is my primary
> boot disk (BIOS). I've changed the order of the disks and it makes no
> difference.
>
> When this disk is not in, booting from some of the other disks gives me a
> warning that there is an invalid superblock magic on sda. They all end up
> with an error on stdin and modprobe failing.
>
>
> Any ideas how to resolve this? I'm leaning towards nuking sd[acd] and
> re-adding them.

I suggest you provide lots more details.
Probably
  mdadm -Dsv
and
  mdadm -Esv
would be a good start.
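The ARRAY summaries these scans print can be boiled down with a short pipeline; a minimal sketch, with a here-doc standing in for the real scan output (running the mdadm commands themselves needs root):

```shell
# Sketch: summarise `mdadm -Esv`-style scan output (array name and member
# count). The here-doc stands in for the real command's output.
scan_output() {
cat <<'EOF'
ARRAY /dev/md0 level=raid1 num-devices=4
ARRAY /dev/md5 level=raid1 num-devices=2
EOF
}
scan_output | awk '/^ARRAY/ {print $2, $4}'
```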

NeilBrown


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Weird critical node problem
  2008-06-10  8:15 ` NeilBrown
@ 2008-06-10  8:30   ` Wayne Gemmell
  2008-06-10 10:06     ` NeilBrown
  0 siblings, 1 reply; 6+ messages in thread
From: Wayne Gemmell @ 2008-06-10  8:30 UTC (permalink / raw)
  To: linux-raid; +Cc: NeilBrown

Sure thing.

On Tuesday 10 June 2008 10:15:22 you wrote:
> I suggest you provide lots more details.
> Probably
>   mdadm -Dsv
ARRAY /dev/md5 level=raid1 num-devices=2 
UUID=6868a02e:0e985748:cae821e2:1cf91e6d
   devices=/dev/sdc2,/dev/sdd2
ARRAY /dev/md0 level=raid1 num-devices=4 
UUID=222850cc:3ee166b9:9e71a84f:e86d40a1
   devices=/dev/sdb1,/dev/sda1,/dev/sdd1,/dev/sdc1
ARRAY /dev/md6 level=raid1 num-devices=2 
UUID=bfa44b7a:d6a2d5fc:cae821e2:1cf91e6d
   devices=/dev/sdb2,/dev/sda2
ARRAY /dev/md1 level=raid1 num-devices=4 
UUID=5f6e694f:1d5441a3:7c5e3c07:b6a0267e
   devices=/dev/sdb3,/dev/sdd3,/dev/sdc3,/dev/sda3
ARRAY /dev/md2 level=raid1 num-devices=4 
UUID=020b3642:b32a5fda:ebae4acf:da43fee2
   devices=/dev/sdc5,/dev/sdd5,/dev/sda5,/dev/sdb5
ARRAY /dev/md3 level=raid1 num-devices=4 
UUID=da7d0fc4:ba7f7bd0:bfe35e68:eec9a5cb
   devices=/dev/sda6,/dev/sdc6,/dev/sdd6,/dev/sdb6
ARRAY /dev/md4 level=raid1 num-devices=4 
UUID=5785dcb6:10ba80a4:b169e59f:d80bc484
   devices=/dev/sdc7,/dev/sda7,/dev/sdb7,/dev/sdd7


> and
>   mdadm -Esv
> would be a good start.
ARRAY /dev/md6 level=raid1 num-devices=2 
UUID=bfa44b7a:d6a2d5fc:cae821e2:1cf91e6d
   devices=/dev/sdb2,/dev/sda2
ARRAY /dev/md0 level=raid1 num-devices=4 
UUID=222850cc:3ee166b9:9e71a84f:e86d40a1
   devices=/dev/sdd1,/dev/sdc1,/dev/sdb1,/dev/sda1
ARRAY /dev/md5 level=raid1 num-devices=2 
UUID=6868a02e:0e985748:cae821e2:1cf91e6d
   devices=/dev/sdd2,/dev/sdc2
ARRAY /dev/md1 level=raid1 num-devices=4 
UUID=5f6e694f:1d5441a3:7c5e3c07:b6a0267e
   devices=/dev/sdd3,/dev/sdc3,/dev/sdb3,/dev/sda3
ARRAY /dev/md2 level=raid1 num-devices=4 
UUID=020b3642:b32a5fda:ebae4acf:da43fee2
   devices=/dev/sdd5,/dev/sdc5,/dev/sdb5,/dev/sda5
ARRAY /dev/md3 level=raid1 num-devices=4 
UUID=da7d0fc4:ba7f7bd0:bfe35e68:eec9a5cb
   devices=/dev/sdd6,/dev/sdc6,/dev/sdb6,/dev/sda6
ARRAY /dev/md4 level=raid1 num-devices=4 
UUID=5785dcb6:10ba80a4:b169e59f:d80bc484
   devices=/dev/sdd7,/dev/sdc7,/dev/sdb7,/dev/sda7

I have found the following in my logs,
Jun  9 17:01:50 lloyd kernel: [   52.912904] mdadm[2656]: segfault at 
0000000000000004 rip 000000000041724c rsp 00007ffff99d9b30 error 4
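Lines like this can be fished out of the system log with a simple filter; a sketch, with a here-doc standing in for the real log file (the second, non-matching line is an illustrative stand-in entry):

```shell
# Sketch: filter mdadm segfault reports from a kernel log excerpt.
# The here-doc stands in for /var/log/kern.log or `dmesg` output.
grep -E 'mdadm\[[0-9]+\]: segfault at' <<'EOF'
Jun  9 17:01:50 lloyd kernel: [   52.912904] mdadm[2656]: segfault at 0000000000000004 rip 000000000041724c rsp 00007ffff99d9b30 error 4
Jun  9 17:01:51 lloyd kernel: [   53.000000] some unrelated log entry
EOF
```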

-- 
Regards
Wayne

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Weird critical node problem
  2008-06-10  8:30   ` Wayne Gemmell
@ 2008-06-10 10:06     ` NeilBrown
  2008-06-10 10:34       ` Wayne Gemmell
  0 siblings, 1 reply; 6+ messages in thread
From: NeilBrown @ 2008-06-10 10:06 UTC (permalink / raw)
  To: wayne; +Cc: linux-raid

On Tue, June 10, 2008 6:30 pm, Wayne Gemmell wrote:
> Sure thing.
>
> On Tuesday 10 June 2008 10:15:22 you wrote:
>> I suggest you provide lots more details.
>> Probably
>>   mdadm -Dsv

Damn, I meant to say "-Dsvv" (two v's) but it doesn't really matter;
I think that is a good enough picture.
However....

> I have found the following in my logs,
> Jun  9 17:01:50 lloyd kernel: [   52.912904] mdadm[2656]: segfault at
> 0000000000000004 rip 000000000041724c rsp 00007ffff99d9b30 error 4
>

I suspect this is the real problem.
Which version of mdadm (mdadm -V)?

If these arrays are being assembled by the initrd, you would need
to find out what mdadm is in the initrd, though it is probably
the same as in /sbin.  What distro?  What kernel version?
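Checking which mdadm an initrd carries amounts to listing the archive; a minimal sketch, assuming a gzip-compressed cpio initramfs as Ubuntu-era kernels used. A throwaway stand-in archive is built here so the listing step can run anywhere:

```shell
# Sketch: list an initramfs to see which mdadm binary it carries.
# A tiny cpio archive stands in for a real /boot/initrd.img.
tmp=$(mktemp -d)
mkdir -p "$tmp/root/sbin"
printf 'stand-in mdadm binary\n' > "$tmp/root/sbin/mdadm"
( cd "$tmp/root" && find . | cpio -o -H newc 2>/dev/null | gzip ) > "$tmp/initrd.img"

# The same listing applies to a real image, e.g.:
#   zcat /boot/initrd.img-$(uname -r) | cpio -t | grep mdadm
zcat "$tmp/initrd.img" | cpio -t 2>/dev/null | grep mdadm
rm -rf "$tmp"
```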

NeilBrown


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Weird critical node problem
  2008-06-10 10:06     ` NeilBrown
@ 2008-06-10 10:34       ` Wayne Gemmell
  2008-06-19  5:07         ` Neil Brown
  0 siblings, 1 reply; 6+ messages in thread
From: Wayne Gemmell @ 2008-06-10 10:34 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

On Tuesday 10 June 2008 12:06:10 NeilBrown wrote:
> On Tue, June 10, 2008 6:30 pm, Wayne Gemmell wrote:
> > Sure thing.
> >
> > On Tuesday 10 June 2008 10:15:22 you wrote:
> >> I suggest you provide lots more details.
> >> Probably
> >>   mdadm -Dsv
>
> Damn, I meant to say "-Dsvv" (two v's) but it doesn't really matter;
> I think that is a good enough picture.
> However....
Just for thoroughness....

/dev/md5:
        Version : 00.90.03
  Creation Time : Mon Jul 30 13:41:01 2007
     Raid Level : raid1
     Array Size : 979840 (957.04 MiB 1003.36 MB)
  Used Dev Size : 979840 (957.04 MiB 1003.36 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 5
    Persistence : Superblock is persistent

    Update Time : Tue Jun 10 10:08:45 2008
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 6868a02e:0e985748:cae821e2:1cf91e6d (local to host lloyd)
         Events : 0.84

    Number   Major   Minor   RaidDevice State
       0       8       34        0      active sync   /dev/sdc2
       1       8       50        1      active sync   /dev/sdd2
/dev/md0:
        Version : 00.90.03
  Creation Time : Wed Aug 30 08:17:25 2006
     Raid Level : raid1
     Array Size : 489856 (478.46 MiB 501.61 MB)
  Used Dev Size : 489856 (478.46 MiB 501.61 MB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Mon Jun  9 18:02:55 2008
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

           UUID : 222850cc:3ee166b9:9e71a84f:e86d40a1
         Events : 0.1598

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8        1        1      active sync   /dev/sda1
       2       8       49        2      active sync   /dev/sdd1
       3       8       33        3      active sync   /dev/sdc1
/dev/md6:
        Version : 00.90.03
  Creation Time : Mon Jul 30 13:46:12 2007
     Raid Level : raid1
     Array Size : 979840 (957.04 MiB 1003.36 MB)
  Used Dev Size : 979840 (957.04 MiB 1003.36 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 6
    Persistence : Superblock is persistent

    Update Time : Tue Jun 10 10:08:46 2008
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : bfa44b7a:d6a2d5fc:cae821e2:1cf91e6d (local to host lloyd)
         Events : 0.5980

    Number   Major   Minor   RaidDevice State
       0       8       18        0      active sync   /dev/sdb2
       1       8        2        1      active sync   /dev/sda2
/dev/md1:
        Version : 00.90.03
  Creation Time : Wed Aug 30 08:18:01 2006
     Raid Level : raid1
     Array Size : 4883648 (4.66 GiB 5.00 GB)
  Used Dev Size : 4883648 (4.66 GiB 5.00 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Tue Jun 10 12:24:23 2008
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

           UUID : 5f6e694f:1d5441a3:7c5e3c07:b6a0267e
         Events : 0.17895140

    Number   Major   Minor   RaidDevice State
       0       8       19        0      active sync   /dev/sdb3
       1       8       51        1      active sync   /dev/sdd3
       2       8       35        2      active sync   /dev/sdc3
       3       8        3        3      active sync   /dev/sda3
/dev/md2:
        Version : 00.90.03
  Creation Time : Wed Aug 30 08:18:46 2006
     Raid Level : raid1
     Array Size : 9767424 (9.31 GiB 10.00 GB)
  Used Dev Size : 9767424 (9.31 GiB 10.00 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Tue Jun 10 12:24:08 2008
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

           UUID : 020b3642:b32a5fda:ebae4acf:da43fee2
         Events : 0.15926826

    Number   Major   Minor   RaidDevice State
       0       8       37        0      active sync   /dev/sdc5
       1       8       53        1      active sync   /dev/sdd5
       2       8        5        2      active sync   /dev/sda5
       3       8       21        3      active sync   /dev/sdb5
/dev/md3:
        Version : 00.90.03
  Creation Time : Wed Aug 30 08:19:37 2006
     Raid Level : raid1
     Array Size : 1951744 (1906.32 MiB 1998.59 MB)
  Used Dev Size : 1951744 (1906.32 MiB 1998.59 MB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 3
    Persistence : Superblock is persistent

    Update Time : Tue Jun 10 12:24:03 2008
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

           UUID : da7d0fc4:ba7f7bd0:bfe35e68:eec9a5cb
         Events : 0.1515534

    Number   Major   Minor   RaidDevice State
       0       8        6        0      active sync   /dev/sda6
       1       8       38        1      active sync   /dev/sdc6
       2       8       54        2      active sync   /dev/sdd6
       3       8       22        3      active sync   /dev/sdb6
/dev/md4:
        Version : 00.90.03
  Creation Time : Fri Sep 15 11:20:47 2006
     Raid Level : raid1
     Array Size : 138215104 (131.81 GiB 141.53 GB)
  Used Dev Size : 138215104 (131.81 GiB 141.53 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 4
    Persistence : Superblock is persistent

    Update Time : Tue Jun 10 12:24:27 2008
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

           UUID : 5785dcb6:10ba80a4:b169e59f:d80bc484
         Events : 0.28934180

    Number   Major   Minor   RaidDevice State
       0       8       39        0      active sync   /dev/sdc7
       1       8        7        1      active sync   /dev/sda7
       2       8       23        2      active sync   /dev/sdb7
       3       8       55        3      active sync   /dev/sdd7


>
> > I have found the following in my logs,
> > Jun  9 17:01:50 lloyd kernel: [   52.912904] mdadm[2656]: segfault at
> > 0000000000000004 rip 000000000041724c rsp 00007ffff99d9b30 error 4
>
> I suspect this is the real problem.
> Which version of mdadm (mdadm -V)?
mdadm - v2.6.2 - 21st May 2007

>
> If these arrays are being assembled by the initrd, you would need
> to find out what mdadm is in the initrd, though it is probably
> the same as in /sbin.  What distro?  What kernel version?
I'm running Ubuntu Gutsy with the 2.6.22-14-server kernel. I may have an old 
version of mdadm in the initrd, so I've regenerated it now. I'll only really 
get to test it again tomorrow.
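After regenerating, it is worth confirming that the initrd copy and /sbin copy report the same version. A sketch of extracting the bare version number from an `mdadm -V`-style banner so two copies can be compared (a shell string stands in for the real command's output):

```shell
# Sketch: extract the bare version from an `mdadm -V`-style banner.
# The string below stands in for the real command's output.
banner='mdadm - v2.6.2 - 21st May 2007'
ver=$(printf '%s\n' "$banner" | sed -n 's/^mdadm - v\([0-9.]*\).*/\1/p')
echo "$ver"
```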


-- 
Regards
Wayne

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Weird critical node problem
  2008-06-10 10:34       ` Wayne Gemmell
@ 2008-06-19  5:07         ` Neil Brown
  0 siblings, 0 replies; 6+ messages in thread
From: Neil Brown @ 2008-06-19  5:07 UTC (permalink / raw)
  To: wayne; +Cc: linux-raid

On Tuesday June 10, wayne@flashmedia.co.za wrote:
> >
> > > I have found the following in my logs,
> > > Jun  9 17:01:50 lloyd kernel: [   52.912904] mdadm[2656]: segfault at
> > > 0000000000000004 rip 000000000041724c rsp 00007ffff99d9b30 error 4
> >
> > I suspect this is the real problem.
> > Which version of mdadm (mdadm -V)?
> mdadm - v2.6.2 - 21st May 2007
> 

I don't know of any segfault problems with this version.  It might be
worth trying a newer release:  2.6.4 or 2.6.7.
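Whether an installed version predates a suggested one can be checked mechanically; a minimal sketch using GNU sort's version ordering, with the version strings taken from this thread:

```shell
# Sketch: compare two version strings using GNU `sort -V` ordering.
installed=2.6.2
suggested=2.6.4
oldest=$(printf '%s\n%s\n' "$installed" "$suggested" | sort -V | head -n1)
if [ "$oldest" = "$installed" ] && [ "$installed" != "$suggested" ]; then
  echo "upgrade: $installed -> $suggested"
fi
```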

If you are still having problems, can you get the exact text of the
error messages?  They might show something useful.

NeilBrown

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2008-06-19  5:07 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-06-10  7:24 Weird critical node problem Wayne Gemmell
2008-06-10  8:15 ` NeilBrown
2008-06-10  8:30   ` Wayne Gemmell
2008-06-10 10:06     ` NeilBrown
2008-06-10 10:34       ` Wayne Gemmell
2008-06-19  5:07         ` Neil Brown

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).