[PATCH] FIX: Cannot remove failed disk from container

linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH] FIX: Cannot remove failed disk from container
@ 2011-12-29 13:27 Adam Kwolek
  2012-01-02 23:31 ` NeilBrown
  0 siblings, 1 reply; 3+ messages in thread
From: Adam Kwolek @ 2011-12-29 13:27 UTC (permalink / raw)
  To: neilb; +Cc: linux-raid, ed.ciechanowski, marcin.labun, dan.j.williams

When disk is failed by mdadm e.g.:
    mdadm -f /dev/md/raid_array /dev/sdX
and then it is tried to be removed from container e.g.:
    mdadm --remove /dev/md/container /dev/sdX

mdadm refuses it with information:
    mdadm: /dev/sdX is still in use, cannot remove.

Problem was introduced in commit:
    monitor: don't unblock a device that isn't blocked.
    /2011-12-06/
Disk without unblocking it cannot be really removed from array
and reference to if is still reported under 'holders' sysfs entry.

As this commit is necessary for managing degraded array during
reshape and rebuild code for unconditional unblocking disk on removal
is added.
Guard for setting DS_UNBLOCK during reshape/rebuild avoids process
performance degradation.

Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
---

 monitor.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/monitor.c b/monitor.c
index 29bde18..ce66625 100644
--- a/monitor.c
+++ b/monitor.c
@@ -346,6 +346,12 @@ static int read_and_act(struct active_array *a)
 				a->next_state = active;
 			}
 			if (a->curr_state > readonly)
+				/* For disk removal when no md process
+				 * /e.g. reshape/ is in progress, trigger
+				 * unconditional device unblock first
+				 */
+				if (a->curr_action == idle)
+					mdi->next_state |= DS_UNBLOCK;
 				mdi->next_state |= DS_REMOVE;
 		}
 	}


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] FIX: Cannot remove failed disk from container
  2011-12-29 13:27 [PATCH] FIX: Cannot remove failed disk from container Adam Kwolek
@ 2012-01-02 23:31 ` NeilBrown
  2012-01-03  0:40   ` NeilBrown
  0 siblings, 1 reply; 3+ messages in thread
From: NeilBrown @ 2012-01-02 23:31 UTC (permalink / raw)
  To: Adam Kwolek; +Cc: linux-raid, ed.ciechanowski, marcin.labun, dan.j.williams

[-- Attachment #1: Type: text/plain, Size: 2165 bytes --]

On Thu, 29 Dec 2011 14:27:38 +0100 Adam Kwolek <adam.kwolek@intel.com> wrote:

> When disk is failed by mdadm e.g.:
>     mdadm -f /dev/md/raid_array /dev/sdX
> and then it is tried to be removed from container e.g.:
>     mdadm --remove /dev/md/container /dev/sdX
> 
> mdadm refuses it with information:
>     mdadm: /dev/sdX is still in use, cannot remove.
> 
> Problem was introduced in commit:
>     monitor: don't unblock a device that isn't blocked.
>     /2011-12-06/
> Disk without unblocking it cannot be really removed from array
> and reference to if is still reported under 'holders' sysfs entry.
> 
> As this commit is necessary for managing degraded array during
> reshape and rebuild code for unconditional unblocking disk on removal
> is added.
> Guard for setting DS_UNBLOCK during reshape/rebuild avoids process
> performance degradation.

You seem to be addressing the symptom rather than understanding the real
problem.

If a device isn't marked as 'blocked' then it simply doesn't make any sense
to "unblock" it - that cannot do anything useful.

If the commit you identify broke things for you, then we need to understand
exactly why.  What exactly is the problem, how was the code previously
allowing things to work?  What is the minimal thing we need to do to allow
things to work again?

Just setting DS_UNBLOCK because it seems to work but without a clear
justification isn't acceptable.

NeilBrown


> 
> Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
> ---
> 
>  monitor.c |    6 ++++++
>  1 files changed, 6 insertions(+), 0 deletions(-)
> 
> diff --git a/monitor.c b/monitor.c
> index 29bde18..ce66625 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -346,6 +346,12 @@ static int read_and_act(struct active_array *a)
>  				a->next_state = active;
>  			}
>  			if (a->curr_state > readonly)
> +				/* For disk removal when no md process
> +				 * /e.g. reshape/ is in progress, trigger
> +				 * unconditional device unblock first
> +				 */
> +				if (a->curr_action == idle)
> +					mdi->next_state |= DS_UNBLOCK;
>  				mdi->next_state |= DS_REMOVE;
>  		}
>  	}


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] FIX: Cannot remove failed disk from container
  2012-01-02 23:31 ` NeilBrown
@ 2012-01-03  0:40   ` NeilBrown
  0 siblings, 0 replies; 3+ messages in thread
From: NeilBrown @ 2012-01-03  0:40 UTC (permalink / raw)
  To: Adam Kwolek; +Cc: linux-raid, ed.ciechanowski, marcin.labun, dan.j.williams

[-- Attachment #1: Type: text/plain, Size: 7978 bytes --]

On Tue, 3 Jan 2012 10:31:04 +1100 NeilBrown <neilb@suse.de> wrote:

> On Thu, 29 Dec 2011 14:27:38 +0100 Adam Kwolek <adam.kwolek@intel.com> wrote:
> 
> > When disk is failed by mdadm e.g.:
> >     mdadm -f /dev/md/raid_array /dev/sdX
> > and then it is tried to be removed from container e.g.:
> >     mdadm --remove /dev/md/container /dev/sdX
> > 
> > mdadm refuses it with information:
> >     mdadm: /dev/sdX is still in use, cannot remove.
> > 
> > Problem was introduced in commit:
> >     monitor: don't unblock a device that isn't blocked.
> >     /2011-12-06/
> > Disk without unblocking it cannot be really removed from array
> > and reference to if is still reported under 'holders' sysfs entry.
> > 
> > As this commit is necessary for managing degraded array during
> > reshape and rebuild code for unconditional unblocking disk on removal
> > is added.
> > Guard for setting DS_UNBLOCK during reshape/rebuild avoids process
> > performance degradation.
> 
> You seem to be addressing the symptom rather than understanding the real
> problem.
> 
> If a device isn't marked as 'blocked' then it simply doesn't make any sense
> to "unblock" it - that cannot do anything useful.
> 
> If the commit you identify broke things for you, then we need to understand
> exactly why.  What exactly is the problem, how was the code previously
> allowing things to work?  What is the minimal thing we need to do to allow
> things to work again?
> 
> Just setting DS_UNBLOCK because it seems to work but without a clear
> justification isn't acceptable.
> 

The problem was that when were try to "remove" a device that can fail and the
comment said we assume another event would happen soon which would trigger a
second attempt to remove the device.  That apparently isn't true.

Your patch that caused it to "unblock" again only worked because that
triggered another event immediately - so you were relying on a side-effect.

I have committed the following two patches which fix the problem 'properly'.

Thanks,
NeilBrown

From 77b3ac8c6521d836dd3c6ef247c252293920fd52 Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.de>
Date: Tue, 3 Jan 2012 11:18:59 +1100
Subject: [PATCH 1/2] monitor: make return from read_and_act more symbolic.

Rather than just a number, use a named flag.

This makes the code easier to understand and allows room for returning
more flags later.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 monitor.c |   20 ++++++++++----------
 1 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/monitor.c b/monitor.c
index 29bde18..cfe4178 100644
--- a/monitor.c
+++ b/monitor.c
@@ -211,6 +211,7 @@ static void signal_manager(void)
  *
  */
 
+#define ARRAY_DIRTY 1
 static int read_and_act(struct active_array *a)
 {
 	unsigned long long sync_completed;
@@ -218,7 +219,7 @@ static int read_and_act(struct active_array *a)
 	int check_reshape = 0;
 	int deactivate = 0;
 	struct mdinfo *mdi;
-	int dirty = 0;
+	int ret = 0;
 	int count = 0;
 
 	a->next_state = bad_word;
@@ -254,14 +255,14 @@ static int read_and_act(struct active_array *a)
 	if (a->curr_state == write_pending) {
 		a->container->ss->set_array_state(a, 0);
 		a->next_state = active;
-		dirty = 1;
+		ret |= ARRAY_DIRTY;
 	}
 	if (a->curr_state == active_idle) {
 		/* Set array to 'clean' FIRST, then mark clean
 		 * in the metadata
 		 */
 		a->next_state = clean;
-		dirty = 1;
+		ret |= ARRAY_DIRTY;
 	}
 	if (a->curr_state == clean) {
 		a->container->ss->set_array_state(a, 1);
@@ -269,7 +270,7 @@ static int read_and_act(struct active_array *a)
 	if (a->curr_state == active ||
 	    a->curr_state == suspended ||
 	    a->curr_state == bad_word)
-		dirty = 1;
+		ret |= ARRAY_DIRTY;
 	if (a->curr_state == readonly) {
 		/* Well, I'm ready to handle things.  If readonly
 		 * wasn't requested, transition to read-auto.
@@ -284,7 +285,7 @@ static int read_and_act(struct active_array *a)
 				a->next_state = read_auto; /* array is clean */
 			else {
 				a->next_state = active; /* Now active for recovery etc */
-				dirty = 1;
+				ret |= ARRAY_DIRTY;
 			}
 		}
 	}
@@ -459,7 +460,7 @@ static int read_and_act(struct active_array *a)
 	if (deactivate)
 		a->container = NULL;
 
-	return dirty;
+	return ret;
 }
 
 static struct mdinfo *
@@ -629,7 +630,6 @@ static int wait_and_act(struct supertype *container, int nowait)
 	rv = 0;
 	dirty_arrays = 0;
 	for (a = *aap; a ; a = a->next) {
-		int is_dirty;
 
 		if (a->replaces && !discard_this) {
 			struct active_array **ap;
@@ -644,14 +644,14 @@ static int wait_and_act(struct supertype *container, int nowait)
 			signal_manager();
 		}
 		if (a->container && !a->to_remove) {
-			is_dirty = read_and_act(a);
+			int ret = read_and_act(a);
 			rv |= 1;
-			dirty_arrays += is_dirty;
+			dirty_arrays += !!(ret & ARRAY_DIRTY);
 			/* when terminating stop manipulating the array after it
 			 * is clean, but make sure read_and_act() is given a
 			 * chance to handle 'active_idle'
 			 */
-			if (sigterm && !is_dirty)
+			if (sigterm && !(ret & ARRAY_DIRTY))
 				a->container = NULL; /* stop touching this array */
 		}
 	}
-- 
1.7.7


From 68226a80812cd9fce63dbd14d2225ffdf16a781b Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.de>
Date: Tue, 3 Jan 2012 00:36:23 +1100
Subject: [PATCH 2/2] monitor: ensure we retry soon when 'remove' fails.

If a 'remove' fails there is no certainty that another event will
happen soon, so make sure we retry soon anyway.

Reported-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
---
 mdadm.h   |    1 +
 monitor.c |   16 ++++++++++------
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/mdadm.h b/mdadm.h
index 3bcd052..381ef86 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -867,6 +867,7 @@ struct supertype {
 			*  external:/md0/12
 			*/
 	int devcnt;
+	int retry_soon;
 
 	struct mdinfo *devs;
 
diff --git a/monitor.c b/monitor.c
index cfe4178..c987d10 100644
--- a/monitor.c
+++ b/monitor.c
@@ -212,6 +212,7 @@ static void signal_manager(void)
  */
 
 #define ARRAY_DIRTY 1
+#define ARRAY_BUSY 2
 static int read_and_act(struct active_array *a)
 {
 	unsigned long long sync_completed;
@@ -419,9 +420,9 @@ static int read_and_act(struct active_array *a)
 		if ((mdi->next_state & DS_REMOVE) && mdi->state_fd >= 0) {
 			int remove_result;
 
-			/* the kernel may not be able to immediately remove the
-			 * disk, we can simply wait until the next event to try
-			 * again.
+			/* The kernel may not be able to immediately remove the
+			 * disk.  In that case we wait a little while and
+			 * try again.
 			 */
 			remove_result = write_attr("remove", mdi->state_fd);
 			if (remove_result > 0) {
@@ -429,7 +430,8 @@ static int read_and_act(struct active_array *a)
 				close(mdi->state_fd);
 				close(mdi->recovery_fd);
 				mdi->state_fd = -1;
-			}
+			} else
+				ret |= ARRAY_BUSY;
 		}
 		if (mdi->next_state & DS_INSYNC) {
 			write_attr("+in_sync", mdi->state_fd);
@@ -597,7 +599,7 @@ static int wait_and_act(struct supertype *container, int nowait)
 		struct timespec ts;
 		ts.tv_sec = 24*3600;
 		ts.tv_nsec = 0;
-		if (*aap == NULL) {
+		if (*aap == NULL || container->retry_soon) {
 			/* just waiting to get O_EXCL access */
 			ts.tv_sec = 0;
 			ts.tv_nsec = 20000000ULL;
@@ -612,7 +614,7 @@ static int wait_and_act(struct supertype *container, int nowait)
 		#ifdef DEBUG
 		dprint_wake_reasons(&rfds);
 		#endif
-
+		container->retry_soon = 0;
 	}
 
 	if (update_queue) {
@@ -653,6 +655,8 @@ static int wait_and_act(struct supertype *container, int nowait)
 			 */
 			if (sigterm && !(ret & ARRAY_DIRTY))
 				a->container = NULL; /* stop touching this array */
+			if (ret & ARRAY_BUSY)
+				container->retry_soon = 1;
 		}
 	}
 
-- 
1.7.7




[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2012-01-03  0:40 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-12-29 13:27 [PATCH] FIX: Cannot remove failed disk from container Adam Kwolek
2012-01-02 23:31 ` NeilBrown
2012-01-03  0:40   ` NeilBrown

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).