Re: /proc/mdstat showing a device missing from array

From: Neil Brown <neilb@suse.de>
To: evoltech@2inches.com
Cc: linux-raid@vger.kernel.org
Subject: Re: /proc/mdstat showing a device missing from array
Date: Tue, 15 May 2007 09:35:09 +1000
Message-ID: <17992.61997.544536.744615@notabene.brown>
In-Reply-To: message from evoltech@2inches.com on Monday May 14

On Monday May 14, evoltech@2inches.com wrote:
> Quoting Neil Brown <neilb@suse.de>:
> 
> > 
> > A raid5 is always created with one missing device and one spare.  This
> > is because recovery onto a spare is faster than resync of a brand new
> > array.
> 
> This is unclear to me.  Do you mean that is how mdadm implements raid5 creation,
> or do you mean that is how raid5 is designed?  I haven't read about this in any
> raid5 documentation or mdadm documentation.  Can you point me in the right
> direction?

In the OPTIONS section of mdadm.8, under "For create, build, or
grow:", it says:
       -f, --force
              Insist that mdadm accept the geometry and layout specified
              without question.  Normally mdadm will not allow creation of
              an array with only one device, and will try to create a raid5
              array with one missing drive (as this makes the initial resync
              work faster).  With --force, mdadm will not try to be so clever.

This is a feature specific to the md implementation of raid5, not
necessarily general to all raid5 implementations.

When md/raid5 performs a "sync", it assumes that most parity blocks
are correct.  So it simply reads all drives in parallel and checks that
each parity block is correct.  When it finds one that isn't (which
should not happen often) it writes the corrected parity block.  This
requires a backward seek and breaks the streaming flow of data off the
drives.
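
As a rough illustration of that sync pass, here is a minimal Python
sketch.  It is a toy model, not md code: each stripe is a list of
equal-length byte strings, the last chunk of every stripe is assumed to
be the parity (real raid5 rotates parity across the drives), and the
names xor_blocks and resync are made up for this example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def resync(stripes):
    """Check the parity of every stripe and rewrite it where it is wrong.

    Each stripe is a list of chunks; by assumption the last chunk is parity.
    Returns the number of stripes that needed a corrective write.
    """
    rewrites = 0
    for stripe in stripes:
        expected = xor_blocks(stripe[:-1])
        if stripe[-1] != expected:
            # In the real array this write goes back to a block the heads
            # have already passed, which is where the extra seek comes from.
            stripe[-1] = expected
            rewrites += 1
    return rewrites
```

When parity is mostly correct the write branch is rarely taken and the
pass is almost entirely sequential reads.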

For a new array, it is likely that most if not all parity blocks are
wrong.  The above algorithm will cause every parity block to be first
read, and then written, producing lots of seeks and much slower
throughput.

If you create a new array degraded and add a spare, it will recover
the spare by reading all the good drives in parallel, computing the
missing drive, and writing that purely sequentially.  That goes much
faster.
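
The recovery path can be sketched the same way.  This is again a toy
model under assumptions, not md code: a stripe is a list of chunks with
None in the slot of the absent drive, and recover_missing is a
hypothetical name.  It relies only on the fact that the XOR of all
chunks in a raid5 stripe (data plus parity) is zero, so any one chunk
is the XOR of the others.

```python
from functools import reduce


def recover_missing(stripes, missing):
    """Rebuild the chunks of one absent drive from the surviving drives.

    stripes: list of stripes, each a list of chunks with None at `missing`.
    Returns the rebuilt chunks in stripe order, i.e. one write stream that
    only ever moves forward on the new drive.
    """
    rebuilt = []
    for stripe in stripes:
        survivors = [c for i, c in enumerate(stripe) if i != missing]
        rebuilt.append(bytes(
            reduce(lambda a, b: a ^ b, col) for col in zip(*survivors)))
    return rebuilt
```

Nothing is ever compared here, so there is no conditional write and no
reason to seek backwards: the good drives are only read and the spare
is only written.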

E.g. on my test machine with 5 SATA drives, I get 42MB/sec recovery
speed, but only around 30MB/sec resync on a new array.
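
To put those two rates in terms of wall-clock time, here is a small
back-of-the-envelope calculation.  The 500 GB size is purely
hypothetical (the drive size is not stated here), and it assumes the
quoted rates apply to the amount of data each drive has to stream.

```python
def sync_hours(size_gb, rate_mb_per_s):
    """Hours needed to stream size_gb gigabytes at rate_mb_per_s (1 GB = 1000 MB)."""
    return size_gb * 1000 / rate_mb_per_s / 3600


size_gb = 500  # hypothetical size, chosen only for illustration
recovery_h = sync_hours(size_gb, 42)
resync_h = sync_hours(size_gb, 30)
print(f"recovery {recovery_h:.1f} h, resync {resync_h:.1f} h, "
      f"ratio {resync_h / recovery_h:.2f}")
```

Whatever size is plugged in, the ratio stays at 42/30 = 1.4, i.e. the
resync takes about 40% longer than the recovery.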

> > 
> > This bug was fixed in mdadm-2.5.2
> 
> The recovery of the array has started, thanks!

Excellent.  I have since realised that there is a kernel bug as
well.  I will get that fixed in the next release so that the old mdadm
will also work properly.

NeilBrown
