* mdadm drives me crazy
From: Fabrice LORRAIN @ 2004-12-01 11:22 UTC
To: linux-raid
Hi all,
Following a crash of one of our RAID5 pools last week, I discovered that
most of our servers show the same problem. So far I haven't found an
explanation. Could someone from the list explain the following output,
and in particular why there is a "failed device" after an mdadm
--create with a 2.4.x kernel:
dd if=/dev/zero of=part[0-5] bs=1k count=20000
losetup /dev/loop[0-5] part[0-5]
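Spelled out, the setup amounts to something like this (just a sketch; the
sizes and names are the ones used above):
$ for i in 0 1 2 3 4 5; do dd if=/dev/zero of=part$i bs=1k count=20000; done
$ for i in 0 1 2 3 4 5; do sudo losetup /dev/loop$i part$i; done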
$ uname -a
Linux fabtest1 2.4.27-1-686 #1 Fri Sep 3 06:28:00 UTC 2004 i686 GNU/Linux
This is the Debian kernel on this box, but all the other tests I did were
with a vanilla kernel.
$ sudo mdadm --version
mdadm - v1.7.0 - 11 August 2004
The box is i386 running an up-to-date pre-sarge (Debian).
(Same problem with 0.7.2 on a woody box and with the 1.4 woody backport;
mdadm 1.8.1 doesn't start building the RAID pool at all on an mdadm --create.)
$ /sbin/lsmod
Module Size Used by Not tainted
raid5 17320 1
md 60064 1 [raid5]
xor 8932 0 [raid5]
loop 9112 18
input 3648 0 (autoclean)
i810 62432 0
agpgart 46244 6 (autoclean)
apm 9868 2 (autoclean)
af_packet 13032 1 (autoclean)
dm-mod 46808 0 (unused)
i810_audio 24444 0
ac97_codec 13300 0 [i810_audio]
soundcore 3940 2 [i810_audio]
3c59x 27152 1
rtc 6440 0 (autoclean)
ext3 81068 2 (autoclean)
jbd 42468 2 (autoclean) [ext3]
ide-detect 288 0 (autoclean) (unused)
ide-disk 16736 3 (autoclean)
piix 9096 1 (autoclean)
ide-core 108504 3 (autoclean) [ide-detect ide-disk piix]
unix 14928 62 (autoclean)
$ sudo mdadm --zero-superblock /dev/loop[0-5]
$ sudo mdadm --create /dev/md0 --level=5 --raid-devices=6 /dev/loop[0-5]
builds the array correctly and, once the build is finished, gives:
$ cat /proc/mdstat
Personalities : [raid5]
read_ahead 1024 sectors
md0 : active raid5 [dev 07:05][5] [dev 07:04][4] [dev 07:03][3] [dev 07:02][2] [dev 07:01][1] [dev 07:00][0]
99520 blocks level 5, 64k chunk, algorithm 2 [6/6] [UUUUUU]
$ sudo mdadm -D /dev/md0
/dev/md0:
Version : 00.90.00
Creation Time : Wed Dec 1 11:39:43 2004
Raid Level : raid5
Array Size : 99520 (97.19 MiB 101.91 MB)
Device Size : 19904 (19.44 MiB 20.38 MB)
Raid Devices : 6
Total Devices : 7
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Wed Dec 1 11:40:29 2004
State : dirty
Active Devices : 6
Working Devices : 6
Failed Devices : 1
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 64K
UUID : 604b72e9:86d7ecd6:578bfb8c:ea071bbd
Events : 0.1
Number Major Minor RaidDevice State
0 7 0 0 active sync /dev/loop0
1 7 1 1 active sync /dev/loop1
2 7 2 2 active sync /dev/loop2
3 7 3 3 active sync /dev/loop3
4 7 4 4 active sync /dev/loop4
5 7 5 5 active sync /dev/loop5
Why in hell do I get a failed device? And what is the real status of
the RAID5 pool?
I have this problem with RAID5 pools on both hd* and sd* hard drives, with
various vanilla 2.4.x kernels; 2.6.x doesn't show this behaviour.
RAID1 pools don't have this problem either.
@+,
Fab
* Re: mdadm drives me crazy
From: Fabrice LORRAIN @ 2004-12-01 12:53 UTC
To: linux-raid; +Cc: Fabrice.Lorrain
Fabrice LORRAIN wrote:
> Hi all,
>
...
> $ sudo mdadm --create /dev/md0 --level=5 --raid-devices=6 /dev/loop[0-5]
$ sudo mdadm --create /dev/md0 --force --level=5 --raid-devices=6 /dev/loop[0-5]
seems to give what I expected (a RAID5 pool with 6 devices and no spare).
From the mdadm man page:
"...When creating a RAID5 array, mdadm will automatically create a
degraded array with an extra spare drive. This is because building the
spare into a degraded array is in general faster than resyncing the
parity on a non-degraded, but not clean, array. This feature can be
over-ridden with the -I --force option."
"-I" doesn't seem to be understood by mdadm. Is it a leftover?
I don't understand what the above extract from the man page means. My
understanding is that the default behaviour of mdadm is to create a
RAID5 pool in degraded mode, i.e. with a missing drive. Is this correct?
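For reference, this is how I compare the two behaviours on the loop devices
(a rough sketch; the expected behaviour is the one described in the man page
extract above):
# Default: a degraded array plus an extra spare is created and the spare is
# rebuilt in; /proc/mdstat shows a recovery in progress until that finishes.
$ sudo mdadm --create /dev/md0 --level=5 --raid-devices=6 /dev/loop[0-5]
$ cat /proc/mdstat
# With --force: all six devices are made active right away and the parity is
# resynced in place instead.
$ sudo mdadm --stop /dev/md0
$ sudo mdadm --zero-superblock /dev/loop[0-5]
$ sudo mdadm --create /dev/md0 --force --level=5 --raid-devices=6 /dev/loop[0-5]
$ cat /proc/mdstat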
After
$ sudo mdadm --create /dev/md0 --force --level=5 --raid-devices=6 /dev/loop[0-5]
the state of the array is dirty. Why?
$ sudo mdadm --stop /dev/md0
followed by
$ sudo mdadm --examine /dev/loop[0-5]
gives a clean state for each device, but
$ sudo mdadm --assemble /dev/md0 /dev/loop[0-5]
keeps the array in the dirty state.
Thanks,
Fab
* Re: mdadm drives me crazy
From: Neil Brown @ 2004-12-01 21:38 UTC
To: Fabrice LORRAIN; +Cc: linux-raid
On Wednesday December 1, Fabrice.Lorrain@univ-mlv.fr wrote:
> Fabrice LORRAIN wrote:
> > Hi all,
> >
> ...
> > $ sudo mdadm --create /dev/md0 --level=5 --raid-devices=6 /dev/loop[0-5]
>
>
> $ sudo mdadm --create /dev/md0 --force --level=5 --raid-devices=6 /dev/loop[0-5]
>
> seems to give what I expected (a RAID5 pool with 6 devices and no spare).
>
> From the mdadm man page:
> "...When creating a RAID5 array, mdadm will automatically create a
> degraded array with an extra spare drive. This is because building the
> spare into a degraded array is in general faster than resyncing the
> parity on a non-degraded, but not clean, array. This feature can be
> over-ridden with the -I --force option."
>
> "-I" doesn't seem to be understood by mdadm. Is it a leftover?
The "-I" is a typo in the man page. It should be ".I", which would
set the "--force" in italics.
>
> I don't understand what the above extract from the man page means. My
> understanding is that the default behaviour of mdadm is to create a
> RAID5 pool in degraded mode, i.e. with a missing drive. Is this correct?
Yes. That is correct.
It does this because (as the man page says) you get a fully in-sync
raid5 array - with all parity blocks correct - sooner.
>
> After
> $ sudo mdadm --create /dev/md0 --force --level=5 --raid-devices=6 /dev/loop[0-5]
>
> the state of the array is dirty. Why?
>
> $ sudo mdadm --stop /dev/md0
> followed by
> $ sudo mdadm --examine /dev/loop[0-5]
>
> gives a clean state for each device, but
>
> $ sudo mdadm --assemble /dev/md0 /dev/loop[0-5]
> keeps the array in the dirty state.
"dirty" should be changed to say "active", and probably will be in the
next release of mdadm.
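In other words (a rough sketch, reusing the loop devices from the earlier
mail), the two states can be compared like this:
$ sudo mdadm --detail /dev/md0     # running array: mdadm 1.7.0 prints "State : dirty" (really active)
$ sudo mdadm --stop /dev/md0
$ sudo mdadm --examine /dev/loop0  # on-disk superblock: reads clean once the resync has finished
$ sudo mdadm --assemble /dev/md0 /dev/loop[0-5]
$ sudo mdadm --detail /dev/md0     # "dirty"/active again as soon as the array is running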
NeilBrown
* RE: mdadm drives me crazy
From: Guy @ 2004-12-01 17:28 UTC
To: 'Fabrice LORRAIN', linux-raid
This is normal (IMO) for a 2.4 kernel.
I think it has been fixed in the 2.6 kernel, but I have never used the
newer kernel, so I can't confirm that. It may also have been fixed by a
newer version of mdadm rather than by the kernel; I'm not sure.
My numbers are much worse!
I have 14 disks and 1 spare.
Raid Devices : 14
Total Devices : 13
Active Devices : 14
Working Devices : 12
Failed Devices : 1
Spare Devices : 1
Number Major Minor RaidDevice State
0 8 49 0 active sync /dev/sdd1
1 8 161 1 active sync /dev/sdk1
2 8 65 2 active sync /dev/sde1
3 8 177 3 active sync /dev/sdl1
4 8 81 4 active sync /dev/sdf1
5 8 193 5 active sync /dev/sdm1
6 8 97 6 active sync /dev/sdg1
7 8 209 7 active sync /dev/sdn1
8 8 113 8 active sync /dev/sdh1
9 8 225 9 active sync /dev/sdo1
10 8 129 10 active sync /dev/sdi1
11 8 241 11 active sync /dev/sdp1
12 8 145 12 active sync /dev/sdj1
13 8 33 13 active sync /dev/sdc1
14 65 1 14 /dev/sdq1
Guy