* Preventing a RAID device from starting until all disks are ready
@ 2010-10-14 15:36 Andrew Klaassen
  2010-10-14 16:00 ` Iordan Iordanov
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Klaassen @ 2010-10-14 15:36 UTC (permalink / raw)
  To: linux-raid

I'm having problems with a 56-drive fibre-channel software RAID-10 array.

During boot, mdadm starts the array before one of the two fibre-channel cards has started its disk detection.  The array comes up, but with only 28 of 56 drives, and I have to manually re-add the drives and cross my fingers that nothing will go wrong during the 10-hour rebuild.

Is there any way to tell mdadm to wait longer, or to not attempt to start the array if not all devices are present, or... (any other solution you can think of)?

I'm on CentOS 5.2.

Thanks.

Andrew






* Re: Preventing a RAID device from starting until all disks are ready
  2010-10-14 15:36 Preventing a RAID device from starting until all disks are ready Andrew Klaassen
@ 2010-10-14 16:00 ` Iordan Iordanov
  2010-10-14 19:31   ` Andrew Klaassen
  2010-10-15  1:54   ` Neil Brown
  0 siblings, 2 replies; 7+ messages in thread
From: Iordan Iordanov @ 2010-10-14 16:00 UTC (permalink / raw)
  To: Andrew Klaassen; +Cc: linux-raid

Hi Andrew,

Andrew Klaassen wrote:
> During boot, mdadm starts the array before one of the two fibre-channel cards has started its disk detection.  The array comes up, but with only 28 of 56 drives, and I have to manually re-add the drives and cross my fingers that nothing will go wrong during the 10-hour rebuild.

Have you considered enabling a write-intent bitmap on your array? This 
way, at least your rebuild will take seconds instead of 10 hours. Write 
intent bitmap support for RAID10 was introduced in 2005, and hopefully 
CentOS 5.2 supports it.
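
As a hedged illustration of the suggestion above: mdadm can add an internal write-intent bitmap to an existing array with `--grow --bitmap=internal`. The sketch below wraps that call and checks `/proc/mdstat` for the resulting `bitmap:` line. The array name `md0` and both helper names are placeholders, not anything from this thread, and whether the mdadm and kernel shipped with CentOS 5.2 accept this for RAID10 would need to be checked on the machine itself.

```shell
# Sketch only: "md0" and both helper names are illustrative assumptions.

# Add an internal write-intent bitmap to an existing array (run as root).
enable_internal_bitmap() {
    # $1 = array device, e.g. /dev/md0
    mdadm --grow "$1" --bitmap=internal
}

# Succeed if the named array has a "bitmap:" line in its mdstat stanza.
has_bitmap() {
    # $1 = array name (e.g. md0), $2 = mdstat file (default /proc/mdstat)
    awk -v name="$1" '
        $2 == ":"                   { in_array = ($1 == name); next }
        NF == 0                     { in_array = 0 }
        in_array && $1 == "bitmap:" { found = 1 }
        END                         { exit(found ? 0 : 1) }
    ' "${2:-/proc/mdstat}"
}
```

A possible use would be `enable_internal_bitmap /dev/md0 && has_bitmap md0`, after which a re-added member should only need to resync the regions the bitmap marks as dirty.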

> Is there any way to tell mdadm to wait longer, or to not attempt to start the array if not all devices are present, or... (any other solution you can think of)?

We have iscsi targets for drives in our array, and we make sure that 
we've logged into all 30 of our drives before we continue to enable 
mdadm (we literally count the number of iscsi sessions open). You can 
try counting the number of block devices present (in /dev/block) that 
match a certain pattern, or perhaps your fiber channel driver offers an 
even more convenient facility in /dev.
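
A minimal sketch of the device-counting idea, assuming the member disks can be matched by a shell glob. The function names, the example glob and the timeout are all assumptions made for illustration; the thread does not say which pattern or limit would suit a given fibre-channel driver.

```shell
# Sketch only: function names, glob and timeout are illustrative assumptions.

# Print how many existing paths match the glob given in $1.
count_matches() {
    cm_n=0
    for cm_path in $1; do
        [ -e "$cm_path" ] && cm_n=$((cm_n + 1))
    done
    echo "$cm_n"
}

# Poll once a second until at least $2 paths match glob $1.
# Returns 0 when the count is reached, 1 if $3 seconds pass first.
wait_for_devices() {
    wd_waited=0
    while [ "$(count_matches "$1")" -lt "$2" ]; do
        [ "$wd_waited" -ge "$3" ] && return 1
        sleep 1
        wd_waited=$((wd_waited + 1))
    done
    return 0
}
```

It might be called from a boot script as `wait_for_devices '/dev/disk/by-path/*-fc-*' 56 120 && mdadm --assemble --scan`. The glob must be quoted so that it is expanded inside the function on every poll, and a pattern that also matches partition links would need a different expected count.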

However, it would be great if there really was a way to tell mdadm to 
wait until the devices are ready. I'm not aware of one though.

Cheers!
Iordan


* Re: Preventing a RAID device from starting until all disks are ready
@ 2010-10-14 17:00 Andrew Klaassen
  0 siblings, 0 replies; 7+ messages in thread
From: Andrew Klaassen @ 2010-10-14 17:00 UTC (permalink / raw)
  To: linux-raid

--- On Thu, 10/14/10, Iordan Iordanov <iordan@cdf.toronto.edu>  wrote:

> Have you considered enabling a write-intent bitmap on your
> array? This way, at least your rebuild will take seconds
> instead of 10 hours. Write intent bitmap support for RAID10
> was introduced in 2005, and hopefully CentOS 5.2 supports
> it.

I've never heard of that - sounds fantastic.  Does it have any performance penalties during heavy writes?

> We have iscsi targets for drives in our array, and we make
> sure that we've logged into all 30 of our drives before we
> continue to enable mdadm (we literally count the number of
> iscsi sessions open). You can try counting the number of
> block devices present (in /dev/block) that match a certain
> pattern, or perhaps your fiber channel driver offers an even
> more convenient facility in /dev.

Are you doing the mdadm startup in rc.local, or in the initrd, or...?

Andrew




* Re: Preventing a RAID device from starting until all disks are ready
  2010-10-14 16:00 ` Iordan Iordanov
@ 2010-10-14 19:31   ` Andrew Klaassen
  2010-10-15  1:54   ` Neil Brown
  1 sibling, 0 replies; 7+ messages in thread
From: Andrew Klaassen @ 2010-10-14 19:31 UTC (permalink / raw)
  To: Iordan Iordanov; +Cc: linux-raid

--- On Thu, 10/14/10, Iordan Iordanov <iordan@cdf.toronto.edu> wrote:

> We have iscsi targets for drives in our array, and we make
> sure that we've logged into all 30 of our drives before we
> continue to enable mdadm (we literally count the number of
> iscsi sessions open). You can try counting the number of
> block devices present (in /dev/block) that match a certain
> pattern, or perhaps your fiber channel driver offers an even
> more convenient facility in /dev.

It seems like the simplest way to do this would be to have two mdadm.conf files; one for the root arrays that need to come up right away, one for the FC/iSCSI arrays that need to wait.

Is either of these ideas:

 - run two "mdadm --monitor" processes simultaneously, one for each set of arrays, or

 - specify two config file arguments to "mdadm --monitor"

...possible?
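
For illustration, the two-config idea could be sketched with mdadm's general `--config=` option, which selects the config file for a single invocation. Both file paths and both helper names below are hypothetical, and the thread does not confirm whether two monitor processes coexist cleanly, so this is a sketch to test rather than a recommendation.

```shell
# Sketch only: both config paths and both helper names are hypothetical.
ROOT_CONF=/etc/mdadm.conf        # arrays needed to boot
LATE_CONF=/etc/mdadm-late.conf   # FC/iSCSI arrays that must wait

# Assemble only the arrays listed in one config file.
assemble_from() {
    # $1 = config file; remaining arguments are passed through to mdadm
    af_conf=$1
    shift
    mdadm --assemble --scan --config="$af_conf" "$@"
}

# Start one monitor daemon for the arrays listed in one config file.
monitor_from() {
    # $1 = config file
    mdadm --monitor --scan --daemonise --config="$1"
}
```

Early in boot this might run `assemble_from "$ROOT_CONF"`, and later, once the slow devices are known to be present, `assemble_from "$LATE_CONF" --no-degraded` followed by `monitor_from` for each file.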

Thanks.

Andrew






* Re: Preventing a RAID device from starting until all disks are ready
  2010-10-14 16:00 ` Iordan Iordanov
  2010-10-14 19:31   ` Andrew Klaassen
@ 2010-10-15  1:54   ` Neil Brown
  2010-10-15  8:19     ` Jon Hardcastle
  1 sibling, 1 reply; 7+ messages in thread
From: Neil Brown @ 2010-10-15  1:54 UTC (permalink / raw)
  To: Iordan Iordanov; +Cc: Andrew Klaassen, linux-raid

On Thu, 14 Oct 2010 12:00:42 -0400
Iordan Iordanov <iordan@cdf.toronto.edu> wrote:

> Hi Andrew,
> 
> Andrew Klaassen wrote:
> > During boot, mdadm starts the array before one of the two fibre-channel cards has started its disk detection.  The array comes up, but with only 28 of 56 drives, and I have to manually re-add the drives and cross my fingers that nothing will go wrong during the 10-hour rebuild.
> 
> Have you considered enabling a write-intent bitmap on your array? This 
> way, at least your rebuild will take seconds instead of 10 hours. Write 
> intent bitmap support for RAID10 was introduced in 2005, and hopefully 
> CentOS 5.2 supports it.
> 
> > Is there any way to tell mdadm to wait longer, or to not attempt to start the array if not all devices are present, or... (any other solution you can think of)?
> 
> We have iscsi targets for drives in our array, and we make sure that 
> we've logged into all 30 of our drives before we continue to enable 
> mdadm (we literally count the number of iscsi sessions open). You can 
> try counting the number of block devices present (in /dev/block) that 
> match a certain pattern, or perhaps your fiber channel driver offers an 
> even more convenient facility in /dev.
> 
> However, it would be great if there really was a way to tell mdadm to 
> wait until the devices are ready. I'm not aware of one though.
> 

Time to go back and read the mdadm man page.  From top to bottom.  Twice.

I suspect that --no-degraded is the flag you want.

It was introduced in mdadm 2.5.

There are three scenarios that could be relevant.

1/ If an array is being assembled explicitly, e.g.
   mdadm --assemble /dev/mdX .....
 then mdadm will refuse to assemble the array if any expected devices are
 missing.  You need to add "--run" to get it to start a partial array.

2/ If an array is being assembled using auto-assembly, e.g.
   mdadm --assemble --scan
 then mdadm will start partial arrays anyway if it cannot find the missing
 parts.  You can tell it not to with --no-degraded.  This flag is actually a
 misnomer.  It may well assemble a degraded array, but only if the array was
 degraded the last time it was active.

3/ If an array is being assembled using a sequence of --incremental commands,
e.g.
   mdadm --incremental /dev/first
   mdadm --incremental /dev/second
  etc

 then mdadm won't assemble the array until all expected devices have been
 found.  Using "--run" will override this so the array is assembled as soon
 as enough devices are present.  Once all possible devices have been
 presented to mdadm with "mdadm --incremental device" you can tell mdadm to
 start any arrays that haven't been started yet with
   mdadm --incremental --run
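
Scenarios 2 and 3 could be scripted roughly as follows. This is a sketch under stated assumptions: the function names, the retry parameters and the `MDSTAT` variable are invented for illustration. The mdadm man page spells the final incremental step as `mdadm -IRs` (`--incremental --run --scan`), so the sketch uses that form.

```shell
# Sketch only: function names, retry counts and MDSTAT are assumptions.
MDSTAT=${MDSTAT:-/proc/mdstat}

# Succeed if the named array (e.g. md0) is listed as active.
array_is_active() {
    grep -q "^$1 : active" "$MDSTAT"
}

# Scenario 2: retry auto-assembly without starting a newly degraded array.
assemble_when_complete() {
    # $1 = array name, $2 = max attempts, $3 = seconds between attempts
    awc_try=0
    while [ "$awc_try" -lt "$2" ]; do
        mdadm --assemble --scan --no-degraded || true
        array_is_active "$1" && return 0
        awc_try=$((awc_try + 1))
        sleep "$3"
    done
    return 1
}

# Scenario 3: offer devices one at a time, then start anything still pending.
# Only call this once device discovery is known to be finished, because the
# final step will start arrays that are still missing members.
assemble_incrementally() {
    for ai_dev in "$@"; do
        mdadm --incremental "$ai_dev" || echo "not accepted: $ai_dev" >&2
    done
    mdadm --incremental --run --scan
}
```

For the array in this thread, something like `assemble_when_complete md0 30 5` would keep trying for a couple of minutes and give up, leaving the array unassembled, if the second card's disks never appear.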

Hope that clears it up.

NeilBrown



* Re: Preventing a RAID device from starting until all disks are ready
  2010-10-15  1:54   ` Neil Brown
@ 2010-10-15  8:19     ` Jon Hardcastle
  0 siblings, 0 replies; 7+ messages in thread
From: Jon Hardcastle @ 2010-10-15  8:19 UTC (permalink / raw)
  To: Iordan Iordanov, Neil Brown; +Cc: Andrew Klaassen, linux-raid

> > Hi Andrew,
> > 
> > Andrew Klaassen wrote:
> > > During boot, mdadm starts the array before one of the two
> > > fibre-channel cards has started its disk detection.  The array comes
> > > up, but with only 28 of 56 drives, and I have to manually re-add the
> > > drives and cross my fingers that nothing will go wrong during the
> > > 10-hour rebuild.
> > 
> > Have you considered enabling a write-intent bitmap on your array? This
> > way, at least your rebuild will take seconds instead of 10 hours. Write
> > intent bitmap support for RAID10 was introduced in 2005, and hopefully
> > CentOS 5.2 supports it.
> > 
> > > Is there any way to tell mdadm to wait longer, or to not attempt to
> > > start the array if not all devices are present, or... (any other
> > > solution you can think of)?
> > 
> > We have iscsi targets for drives in our array, and we make sure that
> > we've logged into all 30 of our drives before we continue to enable
> > mdadm (we literally count the number of iscsi sessions open). You can
> > try counting the number of block devices present (in /dev/block) that
> > match a certain pattern, or perhaps your fiber channel driver offers an
> > even more convenient facility in /dev.
> > 
> > However, it would be great if there really was a way to tell mdadm to
> > wait until the devices are ready. I'm not aware of one though.
> > 
> 
> Time to go back and read the mdadm man page.  From top to bottom.  Twice.
> 
> I suspect that --no-degraded is the flag you want.
> 
> It was introduced in mdadm 2.5
> 
> There are three scenarios that could be relevant.
> 
> 1/ If an array is being assembled explicitly, e.g.
>    mdadm --assemble /dev/mdX .....
>  then mdadm will refuse to assemble the array if any expected devices are
>  missing.  You need to add "--run" to get it to start a partial array.
> 
> 2/ If an array is being assembled using auto-assembly, e.g.
>    mdadm --assemble --scan
>  then mdadm will start partial arrays if it cannot find the missing parts
>  anyway.  You can tell it not to with --no-degraded.  This flag is actually a
>  misnomer.  It may well assemble a degraded array, but only if the array was
>  degraded the last time it was active.


This '--no-degraded' option sounds cool. Can you tell it to apply that logic to some arrays but not others? For example, I have an OS drive that can happily come up degraded if need be. But I also have a 7-drive data array whose cables sometimes come adrift when I am replacing or adding a drive, and I'd rather it just not assemble, so I can go back and check.

(sorry to steal the thread; kinda)


> 
> 3/ If an array is being assembled using a sequence of --incremental commands,
> e.g.
>    mdadm --incremental /dev/first
>    mdadm --incremental /dev/second
>   etc
> 
>  then mdadm won't assemble the array until all expected devices have been
>  found.  Using "--run" will override this so the array is assembled as soon
>  as enough devices are present.  Once all possible devices have been
>  presented to mdadm with "mdadm --incremental device" you can tell mdadm to
>  start any arrays that haven't been started yet with
>    mdadm --incremental --run
> 
> Hope that clears it up.
> 
> NeilBrown
> 



      

* Re: Preventing a RAID device from starting until all disks are ready
       [not found] <505206.15544.qm@web65407.mail.ac4.yahoo.com>
@ 2010-10-18 18:00 ` Iordan Iordanov
  0 siblings, 0 replies; 7+ messages in thread
From: Iordan Iordanov @ 2010-10-18 18:00 UTC (permalink / raw)
  To: Andrew Klaassen; +Cc: linux-raid

Hi Andrew,

My apologies for the late reply, but I've been over-busy at work.

> I've never heard of that - sounds fantastic.  Does it have any performance penalties during heavy writes?

There has to be some performance penalty, since for every write on the 
array, there is an additional write in the write-intent bitmap. However, 
it can be imposed on a device other than your RAID array by keeping the 
write-intent bitmap as a file on a separate file-system. In the mdadm 
manpage, it is specified that the write-intent bitmap can either be 
"internal" - in the MD superblock, or external - in a file. We have kept 
it internal for now, since we experience significantly fewer writes than 
reads. However, we are keeping in mind the option to move it off to 
another file-system if this changes, or we see our write performance 
impacting our users.
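
As a rough sketch of the external option, assuming the array currently carries an internal bitmap: the existing bitmap is removed first with `--bitmap=none`, then a file-backed one is added. The device name, the file path and the helper name are hypothetical, and support for file-backed bitmaps depends on the mdadm and kernel versions in use.

```shell
# Sketch only: the array, the bitmap path and the helper name are hypothetical.
# The bitmap file must live on a filesystem that is not on the array itself.
move_bitmap_to_file() {
    # $1 = array device (e.g. /dev/md0), $2 = bitmap file path
    mdadm --grow "$1" --bitmap=none &&
    mdadm --grow "$1" --bitmap="$2"
}
```

For example, `move_bitmap_to_file /dev/md0 /var/lib/md0.bitmap` would move bitmap updates onto whatever disk holds `/var/lib`.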

> Are you doing the mdadm startup in rc.local, or in the initrd, or...?

We have disabled all of the system startup-scripts, and we have cooked 
our own startup script which does each stage of the startup. In our 
case, we need the following order of operations:

1) Start networking, and bring up a set of bonded interfaces.
2) Login over iscsi to 30 iscsi target drives.
3) Start mdadm
etc.
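
Steps 2 and 3 above might look roughly like the following in a custom startup script. This is only a sketch: the session count of 30 comes from this message, but the timeout, the function names and the exact `iscsiadm` and `mdadm` invocations are assumptions, and step 1 (networking and bonding) is left out.

```shell
# Sketch only: timeout, function names and invocations are assumptions.
EXPECTED_SESSIONS=30
TIMEOUT=120

# Number of iSCSI sessions currently open (one line of output per session).
count_iscsi_sessions() {
    iscsiadm -m session 2>/dev/null | grep -c '^'
}

# Log in to the configured targets, wait for every session, then start md.
start_storage() {
    iscsiadm -m node --loginall=automatic
    ss_waited=0
    while [ "$(count_iscsi_sessions)" -lt "$EXPECTED_SESSIONS" ]; do
        if [ "$ss_waited" -ge "$TIMEOUT" ]; then
            echo "only $(count_iscsi_sessions) iSCSI sessions; not starting md" >&2
            return 1
        fi
        sleep 1
        ss_waited=$((ss_waited + 1))
    done
    mdadm --assemble --scan
}
```

Refusing to run mdadm at all when the count falls short is the design choice that matters here: the array is never offered a partial set of members.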

Cheers,
Iordan
