linux-raid.vger.kernel.org archive mirror
* Inoperative array shown as "active"
@ 2013-09-14  5:39 Ian Pilcher
  2013-09-14  5:59 ` NeilBrown
  0 siblings, 1 reply; 3+ messages in thread
From: Ian Pilcher @ 2013-09-14  5:39 UTC (permalink / raw)
  To: linux-raid

I'm in the process of writing a program to monitor various aspects of
my NAS.  As part of this effort, I've been simulating RAID disk failures
in a VM, and I noticed something that seems very odd.

Namely, when a sufficient number of disks has been removed from a RAID-5
or RAID-6 array to make it inoperable, the array is still shown as
"active" in /proc/mdstat and "clean" in the sysfs array_state file.  For
example:

md0 : active raid5 sde[3](F) sdd[2] sdc[1](F) sdb[0]
      6286848 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/2] [U_U_]

(mdadm does show the state as "clean, FAILED".)

Is this the expected behavior?

AFAICT, this means that there is no single item in either /proc/mdstat
or sysfs that indicates that an array such as the example above has
failed.  My program will have to parse the RAID level, calculate the
number of failed members (if any), and determine whether that RAID level
can survive that number of failures.  Is this correct?
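
Roughly what I have in mind is the sketch below (Python, untested).  It
assumes the level is the second word after "mdN :", that the "[n/m]"
counters are on the following line, and that the guaranteed worst-case
tolerance per level is good enough (RAID-10 can sometimes survive more
than one failure, but one is all that is guaranteed):

#!/usr/bin/env python
# Untested sketch: decide whether each array in /proc/mdstat can still
# operate, using only the RAID level and the "[n/m]" counters.
import re

# Guaranteed number of failed members each personality survives.
TOLERANCE = {"linear": 0, "raid0": 0, "raid4": 1, "raid5": 1,
             "raid6": 2, "raid10": 1}

def check_mdstat(path="/proc/mdstat"):
    results = {}
    lines = open(path).read().splitlines()
    for i, line in enumerate(lines):
        m = re.match(r"^(md\S+) : \S+ (raid\d+|linear) ", line)
        if not m:
            continue                      # not an array line I understand
        name, level = m.groups()
        counts = re.search(r"\[(\d+)/(\d+)\]", lines[i + 1])
        if counts is None:                # raid0/linear have no [n/m]
            continue
        total, working = map(int, counts.groups())
        failed = total - working
        if level == "raid1":
            tolerance = total - 1         # RAID-1 survives all but one
        else:
            tolerance = TOLERANCE.get(level, 0)
        results[name] = (level, failed, failed <= tolerance)
    return results

if __name__ == "__main__":
    for name, (level, failed, ok) in sorted(check_mdstat().items()):
        print("%s: %s, %d failed, %s" % (name, level, failed,
                                         "OK" if ok else "FAILED"))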

Anything I'm missing?

Thanks!

-- 
========================================================================
Ian Pilcher                                         arequipeno@gmail.com
Sometimes there's nothing left to do but crash and burn...or die trying.
========================================================================



* Re: Inoperative array shown as "active"
  2013-09-14  5:39 Inoperative array shown as "active" Ian Pilcher
@ 2013-09-14  5:59 ` NeilBrown
  2013-09-14  6:25   ` Ian Pilcher
  0 siblings, 1 reply; 3+ messages in thread
From: NeilBrown @ 2013-09-14  5:59 UTC (permalink / raw)
  To: Ian Pilcher; +Cc: linux-raid


On Sat, 14 Sep 2013 00:39:20 -0500 Ian Pilcher <arequipeno@gmail.com> wrote:

> I'm in the process of writing a program to monitor various aspects of
> my NAS.  As part of this effort, I've been simulating RAID disk failures
> in a VM, and I noticed something that seems very odd.
> 
> Namely, when a sufficient number of disks has been removed from a RAID-5
> or RAID-6 array to make it inoperable, the array is still shown as
> "active" in /proc/mdstat and "clean" in the sysfs array_state file.  For
> example:
> 
> md0 : active raid5 sde[3](F) sdd[2] sdc[1](F) sdb[0]
>       6286848 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/2] [U_U_]
> 
> (mdadm does show the state as "clean, FAILED".)
> 
> Is this the expected behavior?

Yes.

> 
> AFAICT, this means that there is no single item in either /proc/mdstat
> or sysfs that indicates that an array such as the example above has
> failed.  My program will have to parse the RAID level, calculate the
> number of failed members (if any), and determine whether that RAID level
> can survive that number of failures.  Is this correct?

Yes.

> 
> Anything I'm missing?

mdadm already does this for you. "mdadm --detail /dev/md0".
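
If all you want is a single pass/fail answer, "--test" together with
"--detail" should encode the state in the exit status (0 OK, 1
degraded, 2 unusable, 4 error) - check the mdadm.8 man page for the
version you have.  e.g. (untested):

import subprocess
# Assumes "mdadm --detail --test" sets the exit status as documented.
def array_state(dev):
    null = open("/dev/null", "w")
    rc = subprocess.call(["mdadm", "--detail", "--test", dev],
                         stdout=null, stderr=null)
    null.close()
    return {0: "ok", 1: "degraded", 2: "failed"}.get(rc, "error")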

NeilBrown

> 
> Thanks!
> 




* Re: Inoperative array shown as "active"
  2013-09-14  5:59 ` NeilBrown
@ 2013-09-14  6:25   ` Ian Pilcher
  0 siblings, 0 replies; 3+ messages in thread
From: Ian Pilcher @ 2013-09-14  6:25 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

On 09/14/2013 12:59 AM, NeilBrown wrote:
> On Sat, 14 Sep 2013 00:39:20 -0500 Ian Pilcher <arequipeno@gmail.com> wrote:
>> AFAICT, this means that there is no single item in either /proc/mdstat
>> or sysfs that indicates that an array such as the example above has
>> failed.  My program will have to parse the RAID level, calculate the
>> number of failed members (if any), and determine whether that RAID level
>> can survive that number of failures.  Is this correct?
> 
> Yes.
> 
>>
>> Anything I'm missing?
> 
> mdadm already does this for you. "mdadm --detail /dev/md0".
> 

Yeah, I haven't yet ruled out calling out to mdadm.  I'm already doing
that with hddtemp and smartctl.  It just seems a bit inefficient to do
so when all of the information is sitting right there in /proc/mdstat.

A quick test reveals that running "mdadm --detail /dev/md?*" takes
around 2 seconds on the NAS and produces about 20KB of output.  (I have
20 RAID devices -- hooray GPT! -- and an Atom processor.)  Hmmm.
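
For the moment I'll probably just read the kernel files directly.
Something like this is what I'm thinking of (untested; it assumes the
"level", "degraded", and "raid_disks" attributes are present under
/sys/block/mdX/md/ on my kernel, and it only knows the guaranteed
tolerance for each level):

import glob, os

GUARANTEED = {"raid4": 1, "raid5": 1, "raid6": 2, "raid10": 1}

def failed_arrays():
    failed = []
    for md in glob.glob("/sys/block/md*/md"):
        name = md.split("/")[3]
        level = open(os.path.join(md, "level")).read().strip()
        try:
            degraded = int(open(os.path.join(md, "degraded")).read())
        except IOError:               # no redundancy -> no "degraded" file
            degraded = 0
        if level == "raid1":
            raid_disks = int(open(os.path.join(md, "raid_disks")).read())
            tolerance = raid_disks - 1
        else:
            tolerance = GUARANTEED.get(level, 0)
        if degraded > tolerance:
            failed.append(name)
    return failed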

Thanks for the very quick response!

-- 
========================================================================
Ian Pilcher                                         arequipeno@gmail.com
Sometimes there's nothing left to do but crash and burn...or die trying.
========================================================================

