linux-raid.vger.kernel.org archive mirror
* mdadm monitor spins with start-failed raid0
@ 2010-06-28  6:34 Jeff DeFouw
  2010-06-29  6:44 ` Neil Brown
From: Jeff DeFouw @ 2010-06-28  6:34 UTC
  To: linux-raid

mdadm --monitor --scan (--oneshot) spins indefinitely without sleeping 
when an "inactive" start-failed raid0 or linear array is found in 
/proc/mdstat.  By "start-failed" I mean something attempts to 
(automatically) assemble and start the array, but the array fails to 
start.  In my case, an old raid0 is missing a disk.  The mdstat parser 
assumes all entries have a personality string, but "inactive" arrays 
don't.

md0 : inactive sda3[0]
      2915712 blocks
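
A minimal sketch of how a word-by-word parser can misread the line above.
This is not mdadm's actual code: struct ent, dup_word() and parse_line() are
hypothetical stand-ins, and the "fixed" flag only switches between the
condition before the patch (active >= 0) and after it (active > 0).

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one parsed mdstat entry. */
struct ent {
    int active;          /* -1 unknown, 0 inactive, 1 active */
    char *level;         /* personality string, or NULL if none was seen */
    char first_dev[32];  /* first word recognised as a member device */
};

/* Portable replacement for strdup(); returns NULL on allocation failure. */
static char *dup_word(const char *w)
{
    char *p = malloc(strlen(w) + 1);
    if (p)
        strcpy(p, w);
    return p;
}

/* Parse the words that follow "mdX :" on one mdstat line.
 * fixed == 0 mimics the pre-patch test, fixed == 1 the patched one. */
static void parse_line(const char *line, int fixed, struct ent *e)
{
    char buf[128];
    char *w;
    int in_devs = 0;

    e->active = -1;
    e->level = NULL;
    e->first_dev[0] = '\0';
    snprintf(buf, sizeof(buf), "%s", line);

    for (w = strtok(buf, " "); w; w = strtok(NULL, " ")) {
        if (strcmp(w, "active") == 0) {
            e->active = 1;
        } else if (strcmp(w, "inactive") == 0) {
            e->active = 0;
            if (fixed)
                in_devs = 1;   /* no personality follows: devices come next */
        } else if ((fixed ? e->active > 0 : e->active >= 0) &&
                   e->level == NULL && w[0] != '(' /* readonly */) {
            e->level = dup_word(w);
            in_devs = 1;
        } else if (in_devs && e->first_dev[0] == '\0' && strchr(w, '[')) {
            snprintf(e->first_dev, sizeof(e->first_dev), "%s", w);
        }
    }
}
```

Calling parse_line("inactive sda3[0]", 0, &e) leaves "sda3[0]" in e.level,
while passing 1 for "fixed" leaves e.level NULL and records the device
instead.  The caller is responsible for freeing e.level.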

The first disk (sda3[0] in this case) is copied as the level string.  
The mismatch gets the raid0/linear array into the statelist, which is 
immediately rejected by the statelist loop.  The rejection occurs 
without marking the mdstat entry as used, so the array is seen as a new 
entry again, the sleep/break is skipped, a new duplicate state is added 
to the statelist, and the loop starts again immediately.
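
The sequence described above can be modelled with a small bounded
simulation.  This is only an illustration under stated assumptions, not the
Monitor code: monitor_passes(), USED and MAX_PASSES are hypothetical names,
a single array (md0) is assumed, and the pass count is capped so the
demonstration terminates where the real loop would not.

```c
#include <stdio.h>
#include <string.h>

#define USED       (-1)  /* stand-in for the "used" flag on an mdstat entry */
#define MAX_PASSES 5     /* cap so the demonstration terminates */

/* Returns the pass on which the loop would sleep (or exit with --oneshot),
 * or MAX_PASSES if it never got there.  *states_added receives the number
 * of statelist entries created along the way. */
static int monitor_passes(const char *parsed_level, int really_raid0,
                          int *states_added)
{
    int nstates = 0;
    int pass;

    for (pass = 1; pass <= MAX_PASSES; pass++) {
        int devnum = 0;      /* fresh mdstat read: md0 is unflagged again */
        int new_found = 0;
        int i;

        /* Statelist loop: a raid0/linear array is rejected before its
         * mdstat entry is flagged as used. */
        for (i = 0; i < nstates; i++) {
            if (really_raid0)
                continue;    /* rejected, entry left unflagged */
            devnum = USED;
        }

        /* mdstat loop: adopt any unflagged entry whose level string is
         * present and is neither "raid0" nor "linear". */
        if (devnum != USED && parsed_level &&
            strcmp(parsed_level, "raid0") != 0 &&
            strcmp(parsed_level, "linear") != 0) {
            nstates++;       /* duplicate state for the same array */
            new_found = 1;
        }

        if (!new_found)
            break;           /* the real loop would sleep or exit here */
    }

    *states_added = nstates;
    return pass > MAX_PASSES ? MAX_PASSES : pass;
}
```

With the misparsed level "sda3[0]" and a real raid0 behind it, every pass
adds another state and the cap is reached.  With a NULL level the first pass
finds nothing new and the loop stops immediately.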

Fixing the parser is simple, but fixing it leads to Monitor ignoring ALL 
inactive arrays discovered by mdstat.  This is because the mdstat loop 
requires a level string.  If Monitor should process mdstat-discovered 
start-failed arrays (as it currently does), then either the level will 
have to be checked using GET_ARRAY_INFO, or raid0/linear arrays will 
have to be rejected later.

This patch only shows how to fix the parser.

---
 mdstat.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/mdstat.c b/mdstat.c
index 4a9f370..fdca877 100644
--- a/mdstat.c
+++ b/mdstat.c
@@ -168,9 +168,10 @@ struct mdstat_ent *mdstat_read(int hold, int start)
 			char *eq;
 			if (strcmp(w, "active")==0)
 				ent->active = 1;
-			else if (strcmp(w, "inactive")==0)
+			else if (strcmp(w, "inactive")==0) {
 				ent->active = 0;
-			else if (ent->active >=0 &&
+				in_devs = 1;
+			} else if (ent->active > 0 &&
 				 ent->level == NULL &&
 				 w[0] != '(' /*readonly*/) {
 				ent->level = strdup(w);
-- 
1.7.1

-- 
Jeff DeFouw <jeffd@i2k.com>


* Re: mdadm monitor spins with start-failed raid0
  2010-06-28  6:34 mdadm monitor spins with start-failed raid0 Jeff DeFouw
@ 2010-06-29  6:44 ` Neil Brown
From: Neil Brown @ 2010-06-29  6:44 UTC
  To: Jeff DeFouw; +Cc: linux-raid

On Mon, 28 Jun 2010 02:34:33 -0400
Jeff DeFouw <jeffd@i2k.com> wrote:

> mdadm --monitor --scan (--oneshot) spins indefinitely without sleeping 
> when an "inactive" start-failed raid0 or linear array is found in 
> /proc/mdstat.  By "start-failed" I mean something attempts to 
> (automatically) assemble and start the array, but the array fails to 
> start.  In my case, an old raid0 is missing a disk.  The mdstat parser 
> assumes all entries have a personality string, but "inactive" arrays 
> don't.
> 
> md0 : inactive sda3[0]
>       2915712 blocks
> 
> The first disk (sda3[0] in this case) is copied as the level string.  
> The mismatch gets the raid0/linear array into the statelist, which is 
> immediately rejected by the statelist loop.  The rejection occurs 
> without marking the mdstat entry as used, so the array is seen as a new 
> entry again, the sleep/break is skipped, a new duplicate state is added 
> to the statelist, and the loop starts again immediately.
> 
> Fixing the parser is simple, but fixing it leads to Monitor ignoring ALL 
> inactive arrays discovered by mdstat.  This is because the mdstat loop 
> requires a level string.  If Monitor should process mdstat-discovered 
> start-failed arrays (as it currently does), then either the level will 
> have to be checked using GET_ARRAY_INFO, or raid0/linear arrays will 
> have to be rejected later.
> 
> This patch only shows how to fix the parser.

Hi Jeff,
 thanks for the patch.  I have queued it for the next release.

I think only the parser needs to be fixed.  mdadm has never been intended to
monitor inactive arrays because, like linear and raid0, nothing
interesting can happen to them.

Thanks a lot,
NeilBrown


> 
> ---
>  mdstat.c |    5 +++--
>  1 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/mdstat.c b/mdstat.c
> index 4a9f370..fdca877 100644
> --- a/mdstat.c
> +++ b/mdstat.c
> @@ -168,9 +168,10 @@ struct mdstat_ent *mdstat_read(int hold, int start)
>  			char *eq;
>  			if (strcmp(w, "active")==0)
>  				ent->active = 1;
> -			else if (strcmp(w, "inactive")==0)
> +			else if (strcmp(w, "inactive")==0) {
>  				ent->active = 0;
> -			else if (ent->active >=0 &&
> +				in_devs = 1;
> +			} else if (ent->active > 0 &&
>  				 ent->level == NULL &&
>  				 w[0] != '(' /*readonly*/) {
>  				ent->level = strdup(w);


