* raid 5 created with 7 out of 8
@ 2005-01-10  1:38 Bjørn Eikeland
  2005-01-10  3:24 ` Guy
  0 siblings, 1 reply; 5+ messages in thread
From: Bjørn Eikeland @ 2005-01-10  1:38 UTC (permalink / raw)
  To: linux-raid

Hi, I'm trying to set up a raid5 array using 8 ide drives (
/dev/hd[e-l] ) but I'm having a hard time.

I'm using Slackware 10, kernel 2.4.26 and mdadm 1.8.1 (downloading
kernel 2.4.28 overnight now).

The problem is that mdadm creates the array with 7 of the 8 drives up
and running and the last one as a spare, and it does not start
recovering onto the spare. It also will not let me remove the spare and
re-add it. Below is a script capture of the whole thing (minus
repartitioning the drives and zeroing any remaining superblocks).

Any help will be greatly appreciated.
-thanks

root@filebear:~# mdadm -C /dev/md0 -l5 -n8 -c512 /dev/hd[e-l]1
VERS = 9000
mdadm: array /dev/md0 started.
root@filebear:~# cat /proc/mdstat 
Personalities : [linear] [raid0] [raid1] [raid5] 
read_ahead 1024 sectors
md0 : active raid5 hdl1[8] hdk1[6] hdj1[5] hdi1[4] hdh1[3] hdg1[2]
hdf1[1] hde1[0]
      1094017792 blocks level 5, 512k chunk, algorithm 2 [8/7] [UUUUUUU_]
      
unused devices: <none>
root@filebear:~# mdadm /dev/md0 -f /dev/hdl1
mdadm: set /dev/hdl1 faulty in /dev/md0
root@filebear:~# mdadm /dev/md0 -r /dev/hdl
mdadm: hot removed /dev/hdl1
root@filebear:~# mdadm /dev/md0 -a /dev/hdl1
mdadm: hot add failed for /dev/hdl1: No space left on device
root@filebear:~# mdadm /dev/md0 -f /dev/hde1
mdadm: set /dev/hde1 faulty in /dev/md0
root@filebear:~# mdadm /dev/md0 -r /dev/hde1
mdadm: hot removed /dev/hde1
root@filebear:~# mdadm /dev/md0 -a /dev/hde1
mdadm: hot add failed for /dev/hde1: No space left on device
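
For completeness, the clean-up between attempts looks roughly like this
(a sketch only -- repartitioning omitted, device names as above):

mdadm --stop /dev/md0                  # tear down the failed attempt
for d in /dev/hd[e-l]1; do
    mdadm --zero-superblock $d         # wipe any leftover md superblock
done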


* RE: raid 5 created with 7 out of 8
  2005-01-10  1:38 raid 5 created with 7 out of 8 Bjørn Eikeland
@ 2005-01-10  3:24 ` Guy
  2005-01-10  5:27   ` Minor bugs in "mdadm --monitor --scan &" Guy
       [not found]   ` <f4146e7c05011001554b39fd9f@mail.gmail.com>
  0 siblings, 2 replies; 5+ messages in thread
From: Guy @ 2005-01-10  3:24 UTC (permalink / raw)
  To: 'Bjørn Eikeland', linux-raid

From the 1.8.1 release notes:
This is a "development" release of mdadm.  It should *not* be
considered stable and should be used primarily for testing.
The current "stable" version is 1.8.0.

Your email shows "VERS = 9000".  Was that a command-line option, or
output from mdadm?

The only other odd thing I see is that you have the largest chunk size I
have seen (-c512).  But I don't know of any limits.

I did create an array with this command line.  No problems.
mdadm -C /dev/md3 -l5 -n8 -c512 /dev/ram[0-7]

from cat /proc/mdstat:
md3 : active raid5 [dev 01:07][7] [dev 01:06][6] [dev 01:05][5] [dev
01:04][4] [dev 01:03][3] [dev 01:02][2] [dev 01:01][1] [dev 01:00][0]
      25088 blocks level 5, 512k chunk, algorithm 2 [8/8] [UUUUUUUU]
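
If you want to repeat the same test, the full sequence is roughly this
(a sketch; it assumes /dev/ram0 through /dev/ram7 exist and are large
enough for the 512k chunk size):

mdadm -C /dev/md3 -l5 -n8 -c512 /dev/ram[0-7]
cat /proc/mdstat               # expect [8/8] [UUUUUUUU]
mdadm -E /dev/ram0             # per-member superblock, for comparison
mdadm -S /dev/md3              # stop the test array afterwards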

Send output of:
mdadm -D /dev/md0

I am using mdadm V1.8.0 and kernel 2.4.28.

Guy



* Minor bugs in "mdadm --monitor --scan &"
  2005-01-10  3:24 ` Guy
@ 2005-01-10  5:27   ` Guy
  2005-01-10  7:00     ` Guy
       [not found]   ` <f4146e7c05011001554b39fd9f@mail.gmail.com>
  1 sibling, 1 reply; 5+ messages in thread
From: Guy @ 2005-01-10  5:27 UTC (permalink / raw)
  To: linux-raid

I have mdadm configured to run a script when an event occurs.

I start mdadm like this:
mdadm --monitor --scan &

That is from a script in /etc/init.d

My /etc/mdadm.conf file has this (the other lines are not related):
PROGRAM /root/bin/handle-mdadm-events

The script has these 2 lines:
echo '$1'=$1 '$2'=$2 '$3'=$3 '$4'=$4 >> /root/bin/handle-mdadm-events.log
(date; cat /proc/mdstat; mdadm --detail $2) | mail -s "md event: $1 $2 $3" bugzilla@watkins-home.com
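
For what it's worth, a slightly more defensive sketch of such a handler
(my embellishment, not what I actually run; the shebang, quoting and the
empty-$3 case are additions):

#!/bin/sh
# Sketch of /root/bin/handle-mdadm-events, called by mdadm --monitor as:
#   handle-mdadm-events EVENT MD-DEVICE [COMPONENT-DEVICE]
EVENT=$1; MD=$2; DEV=$3
echo "`date` event=$EVENT md=$MD dev=$DEV" >> /root/bin/handle-mdadm-events.log
{ date; cat /proc/mdstat; [ -n "$MD" ] && mdadm --detail "$MD"; } \
  | mail -s "md event: $EVENT $MD $DEV" bugzilla@watkins-home.com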

I have a test array with 8 disks, /dev/ram[0-7]

I ran this command:

# mdadm /dev/md3 -f /dev/ram0
mdadm: set /dev/ram0 faulty in /dev/md3

I waited, I got 1 email:
Fail /dev/md3 /dev/ram0

I ran these 2 commands:
# mdadm /dev/md3 -a /dev/ram8
mdadm: hot added /dev/ram8
# mdadm /dev/md3 -a /dev/ram9
mdadm: hot added /dev/ram9

Now I have 2 spares.
I waited, I got this email:
SpareActive /dev/md3 /dev/ram8

The Fail event and 4 others were missed.
Examples from a slower array:
$1=Fail $2=/dev/md2 $3=/dev/sdq1 $4=
$1=Rebuild20 $2=/dev/md2 $3= $4=
$1=Rebuild40 $2=/dev/md2 $3= $4=
$1=Rebuild60 $2=/dev/md2 $3= $4=
$1=Rebuild80 $2=/dev/md2 $3= $4=
$1=SpareActive $2=/dev/md2 $3=/dev/sdc1 $4=


I ran this command:
# mdadm /dev/md3 -f /dev/ram1
mdadm: set /dev/ram1 faulty in /dev/md3

No emails were generated.  About 6 events were missed.
The Fail and SpareActive events were missed, and the 4 Rebuild events.

I think the events were missed because the state changed, and then
changed back, within 60 seconds (the monitor's default polling interval).

I don't recall ever missing an event on a "real" array, but with faster
disks and very small /boot partitions I believe it could easily happen.
My small partitions don't have spares.
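
One possible mitigation (just a sketch; --delay is a standard --monitor
option, and 15 seconds is only an example value) is to poll more often,
so short-lived state changes are less likely to fall between checks:

mdadm --monitor --scan --delay=15 &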

Also, adds and removes don't generate events.

Also, if there is no spare, the console displays an extra warning:
"md3: no spare disk to reconstruct array! -- continuing in degraded mode"
Maybe this event should also generate an email.

If there is a spare, the console displays this message:
"md3: resyncing spare disk [dev 01:0e] to replace failed disk"

Maybe both of the above should generate emails.  Otherwise you must
wait for the Rebuild20 event to know that a spare is being rebuilt, or
wait forever if there is no spare.

Just noticed while playing!
If I use MAILADDR and don't use PROGRAM, like this:
MAILADDR bugzilla@watkins-home.com
# PROGRAM /root/bin/handle-mdadm-events

I don't get Fail events, but I do get some events, like SpareActive.
No!  In another test I got the Fail event, but not the SpareActive.
With the above I did wait 60 seconds or more!

And when I start monitor mode using PROGRAM I get these:
$1=SparesMissing $2=/dev/md2 $3= $4=
$1=SparesMissing $2=/dev/md3 $3= $4=
$1=SparesMissing $2=/dev/md1 $3= $4=
$1=SparesMissing $2=/dev/md0 $3= $4=

But when using MAILADDR I don't get them!
And they are wrong!  /dev/md2 does have a spare, and sometimes md3 has one.

Also, if I use both PROGRAM and MAILADDR, I get some events via
MAILADDR and some via PROGRAM; I don't always get all events from both.
I have not tried this much, so no details.
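
For reference, the combined configuration I mean looks roughly like this
(the device lists and spares= counts are made-up examples, and my
understanding from the man page is that SparesMissing fires when an
array has fewer spares than its ARRAY line claims):

MAILADDR bugzilla@watkins-home.com
PROGRAM /root/bin/handle-mdadm-events
ARRAY /dev/md2 devices=/dev/sd[a-i]1 spares=1
ARRAY /dev/md3 devices=/dev/ram[0-7] spares=0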

Maybe md could save events in a queue, and mdadm --monitor could access
the queue.  Maybe something like /proc/mdevents could be useful.

I am using kernel 2.4.28 and mdadm 1.8.0.

Guy



* RE: Minor bugs in "mdadm --monitor --scan &"
  2005-01-10  5:27   ` Minor bugs in "mdadm --monitor --scan &" Guy
@ 2005-01-10  7:00     ` Guy
  0 siblings, 0 replies; 5+ messages in thread
From: Guy @ 2005-01-10  7:00 UTC (permalink / raw)
  To: linux-raid

I checked my email, and I got this wrong:

Guy said:
"If I use MAILADDR and don't use PROGRAM, like this:
MAILADDR bugzilla@watkins-home.com
# PROGRAM /root/bin/handle-mdadm-events

I don't get Fail events, but I do get some events, like SpareActive.
No!  Another test I got the Fail event, but not the SpareActive.
With the above I did wait 60 seconds or more!"

I do get Fail events, but do not get SpareActive.
Both tests were the same.  Not sure how I got that wrong.

Guy



* Re: raid 5 created with 7 out of 8
       [not found]   ` <f4146e7c05011001554b39fd9f@mail.gmail.com>
@ 2005-01-10  9:58     ` Bjørn Eikeland
  0 siblings, 0 replies; 5+ messages in thread
From: Bjørn Eikeland @ 2005-01-10  9:58 UTC (permalink / raw)
  To: linux-raid

Hi, I don't know what's up with the "VERS = 9000" output, but it's
definitely from mdadm, as it has been present on every attempt to
create the array.

However, I've upgraded to the 2.4.28 kernel and that made no
difference. But issuing mdadm -Cf /dev/md0 -l5 -n8 -c256 /dev/hd[e-l]1
(note the "f") made all my problems go away; the array is now fully
operational.

As for the chunk size, that is just a performance experiment; it seems
256 was the winner in the end, though. I can try to recreate the
problem later today if the output of mdadm -D /dev/md0 is of any
interest in finding out whether this is a bug or just me. (It should
also be mentioned that the older version of mdadm that shipped with
Slackware 10 produced the same problem.)
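
For completeness, the working sequence here (the comment about --force
is just my reading of the man page):

mdadm -Cf /dev/md0 -l5 -n8 -c256 /dev/hd[e-l]1
    # -f/--force: build with all 8 members active, instead of the usual
    # raid5 create of 7 active + 1 spare that then recovers onto the spare
cat /proc/mdstat       # should show [8/8] [UUUUUUUU]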

-Bjorn


