* failing a drive while RAID5 is initializing
@ 2011-02-23 16:20 Iordan Iordanov
From: Iordan Iordanov @ 2011-02-23 16:20 UTC (permalink / raw)
To: linux-raid
Hi guys,
I just wanted to make sure that the behaviour I observed is as expected.
With kernel 2.6.35.11, under Debian Lenny, I created a RAID5 array with
5 drives, partitioned it, formatted a partition with ext3 and mounted
it. Then, I put some load onto the filesystem with:
dd if=/dev/urandom of=/mnt/testfile
The array started initializing. At that point, I needed to fail and
replace a drive for some unrelated testing, so I did that with:
mdadm /dev/md0 -f /dev/sdc
The result was a broken filesystem which was remounted read-only, and a
bunch of errors in dmesg. Theoretically, one would imagine that failing
a drive on a RAID5 even during initialization should render the array
without redundancy but workable. Am I wrong? Is there something special
about the initialization stage of RAID5 that makes drive failure fatal
during the initialization? If not, then I have a bug to report and I'll
try to reproduce it for you.
If initialisation is special, does that mean that when creating RAID5,
it is advisable to *wait* until the array has fully initialized before
using it, otherwise one risks losing any data that was put onto the
array during the initialization phase if a drive fails at that point?
Many thanks for any input,
Iordan Iordanov
* Re: failing a drive while RAID5 is initializing
From: Robin Hill @ 2011-02-23 16:33 UTC (permalink / raw)
To: Iordan Iordanov; +Cc: linux-raid
On Wed Feb 23, 2011 at 11:20:34AM -0500, Iordan Iordanov wrote:
> Hi guys,
>
> I just wanted to make sure that the behaviour I observed is as expected.
> With kernel 2.6.35.11, under Debian Lenny, I created a RAID5 array with
> 5 drives, partitioned it, formatted a partition with ext3 and mounted
> it. Then, I put some load onto the filesystem with:
>
> dd if=/dev/urandom of=/mnt/testfile
>
> The array started initializing. At that point, I needed to fail and
> replace a drive for some unrelated testing, so I did that with:
>
> mdadm /dev/md0 -f /dev/sdc
>
> The result was a broken filesystem which was remounted read-only, and a
> bunch of errors in dmesg. Theoretically, one would imagine that failing
> a drive on a RAID5 even during initialization should render the array
> without redundancy but workable. Am I wrong? Is there something special
> about the initialization stage of RAID5 that makes drive failure fatal
> during the initialization? If not, then I have a bug to report and I'll
> try to reproduce it for you.
>
The array is created in a degraded state, then recovered onto the final
disk. Pulling any disk bar the final one will result in a broken array
and lost data until this recovery is completed.
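As a toy illustration of why that is (my own sketch, not md's actual code): RAID5 parity is the XOR of the data blocks, so a lost data disk can only be rebuilt once the parity disk actually holds that XOR. Until the initial recovery has written it, the arithmetic simply doesn't hold:

```python
# Toy model of one stripe of a 5-disk RAID5 during initial recovery.
# Data lives on disks 0-3; parity (XOR of the data blocks) is being
# rebuilt onto disk 4.

def parity(blocks):
    """XOR parity of a list of equal-length byte blocks."""
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

data = [bytes([d] * 4) for d in (1, 2, 3, 4)]   # blocks on disks 0-3

# Once recovery has written parity, any single data disk can be rebuilt
# by XORing the parity block with the surviving data blocks:
p = parity(data)
rebuilt = parity([p] + data[1:])                # reconstruct disk 0
assert rebuilt == data[0]

# Before recovery reaches this stripe, the parity disk holds whatever
# was there before (here modelled as zeros), so losing a data disk
# loses that data for good:
stale = bytes(4)                                # uninitialised "parity"
assert parity([stale] + data[1:]) != data[0]
```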
> If initialisation is special, does that mean that when creating RAID5,
> it is advisable to *wait* until the array has fully initialized before
> using it, otherwise one risks losing any data that was put onto the
> array during the initialization phase if a drive fails at that point?
>
That depends. It's advisable not to use it for critical, non-backed-up
data (the same as during a recovery following a drive failure). The
alternatives are to wait until the array is fully initialised before
making it available to the user (which could take a considerable amount
of time), or to manually zero all the drives and then create the array
using --assume-clean (in which case the zeroing means the parity data is
correct by default, since the XOR of all-zero blocks is zero).
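A rough command sketch of those two options (device names /dev/md0 and /dev/sd[b-f] are placeholders for your own setup; this is untested against real hardware, so check mdadm(8) before running anything):

```shell
# Option 1: zero every member, then create with --assume-clean.
# All-zero data XORs to all-zero parity, so the array is consistent
# from the start and no initial recovery is needed.
for d in /dev/sd[b-f]; do dd if=/dev/zero of="$d" bs=1M; done
mdadm --create /dev/md0 --level=5 --raid-devices=5 --assume-clean /dev/sd[b-f]

# Option 2: create normally, but wait for the initial recovery to
# finish before putting data on the array.
mdadm --create /dev/md0 --level=5 --raid-devices=5 /dev/sd[b-f]
mdadm --wait /dev/md0     # blocks until resync/recovery completes
cat /proc/mdstat          # confirm no recovery line remains
```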
Some of the features on Neil's current roadmap should allow for "lazy
initialisation" where the recovery data is added only as the drive is
written to, which should mean the array is available immediately but
still retains full recoverability.
Cheers,
Robin
--
___
( ' } | Robin Hill <robin@robinhill.me.uk> |
/ / ) | Little Jim says .... |
// !! | "He fallen in de water !!" |
* Re: failing a drive while RAID5 is initializing
From: Iordan Iordanov @ 2011-02-23 22:08 UTC (permalink / raw)
To: linux-raid
Thanks for the clear answer. I am glad I didn't uncover a bug with
something so basic.
> Some of the features on Neil's current roadmap should allow for "lazy
> initialisation" where the recovery data is added only as the drive is
> written to, which should mean the array is available immediately but
> still retains full recoverability.
I had thought this was how things already worked, but now that you
mention it, the man page does say that the array starts out degraded
and rebuilds onto the last device during initialization.
To make this lazy initialization possible, one would have to have a
bitmap of "dirtied" chunks, though, right?
Cheers,
Iordan
* Re: failing a drive while RAID5 is initializing
From: Robin Hill @ 2011-02-23 22:33 UTC (permalink / raw)
To: Iordan Iordanov; +Cc: linux-raid
On Wed Feb 23, 2011 at 05:08:48PM -0500, Iordan Iordanov wrote:
> > Some of the features on Neil's current roadmap should allow for "lazy
> > initialisation" where the recovery data is added only as the drive is
> > written to, which should mean the array is available immediately but
> > still retains full recoverability.
>
> I was thinking that this is how things work now, but now that you
> mentioned it, the man-page does say that the array is degraded to begin
> with and is rebuilding onto the last device during initialization.
>
> To make this lazy initialization possible, one would have to have a
> bitmap of "dirtied" chunks, though, right?
>
Yes, that's correct. He's also suggested using this same bitmap with
TRIM operations for SSDs - I'm not sure whether this is mainly intended
for recognising when an entire block is unsynced and can be trimmed, or
whether it's primarily to allow TRIM operations to be delayed until the
system is less busy (as they require a flush of the I/O buffer and
command queue).
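The dirty-chunk bitmap being discussed could be sketched like this (illustrative names and granularity only, not md's real on-disk format): chunks are marked dirty on first write, and only dirty chunks ever need valid parity or a resync.

```python
# Toy sketch of "lazy initialisation": a bitmap tracks which chunks
# have ever been written. Untouched chunks hold no real data, so they
# need neither initial parity nor recovery after a failure.

class LazyInitBitmap:
    def __init__(self, array_size, chunk_size):
        self.chunk_size = chunk_size
        nchunks = (array_size + chunk_size - 1) // chunk_size
        self.dirty = [False] * nchunks   # False = never written

    def on_write(self, offset, length):
        """Mark every chunk touched by a write as dirty."""
        first = offset // self.chunk_size
        last = (offset + length - 1) // self.chunk_size
        for c in range(first, last + 1):
            self.dirty[c] = True

    def chunks_needing_resync(self):
        """Only dirty chunks carry data, so only they need recovery."""
        return [c for c, d in enumerate(self.dirty) if d]

bm = LazyInitBitmap(array_size=1024, chunk_size=64)   # 16 chunks
bm.on_write(offset=100, length=200)                   # touches chunks 1-4
assert bm.chunks_needing_resync() == [1, 2, 3, 4]
```

The same map is what makes the TRIM idea above possible: a chunk whose bit is clear has never held data and can be discarded outright.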
Anyway, I'd suggest reading the roadmap as it goes into a lot more
detail on the planned implementation (it was posted here a week or so
ago, and is also on his blog at
http://neil.brown.name/blog/20090129234603).
Cheers,
Robin
--
___
( ' } | Robin Hill <robin@robinhill.me.uk> |
/ / ) | Little Jim says .... |
// !! | "He fallen in de water !!" |