Re: Spare drive won't spin down

From: Neil Brown <neilb@suse.de>
To: Neil Brown <neilb@suse.de>
Cc: Bill Davidsen <davidsen@tmr.com>,
	Joe Bryant <tenminjoe@yahoo.com>,
	linux-raid@vger.kernel.org
Subject: Re: Spare drive won't spin down
Date: Tue, 18 May 2010 10:20:17 +1000
Message-ID: <20100518102017.51fef799@notabene.brown>
In-Reply-To: <20100512065318.44e934d4@notabene.brown>

On Wed, 12 May 2010 06:53:18 +1000
Neil Brown <neilb@suse.de> wrote:

> Theoretically, when the spares are one behind the active array and we need to
> update them all, we should update the spares first, then the rest.  If we
> don't and there is a crash at the wrong time, some spares could be 2 events
> behind the most recent device.  However that is a fairly unlikely race to
> lose and the cost is only having a spare device fall out of the array, which
> is quite easy to put back in, so I might not worry too much about it.
> 
> So if you haven't seen a patch to fix this in a week or two, please remind me.
> 
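To make the quoted ordering argument concrete, here is a minimal sketch of the arithmetic. It is an illustration only, not md code: `model_gap_after_crash` and `enum write_order` are hypothetical names, and the model assumes a crash lands after the first group of superblocks has been written and before the second.

```c
#include <stdint.h>

/* Hypothetical model, not kernel code: which group of superblocks
 * is written before the crash. */
enum write_order { SPARES_FIRST, ACTIVE_FIRST };

/* Return the gap between the active devices' event count and the
 * spares' event count if a crash happens after only the first group
 * has been updated to new_events. */
static uint64_t model_gap_after_crash(uint64_t active_events,
				      uint64_t spare_events,
				      uint64_t new_events,
				      enum write_order order)
{
	/* Only the group written first carries the new count. */
	uint64_t active = (order == ACTIVE_FIRST) ? new_events : active_events;
	uint64_t spare = (order == SPARES_FIRST) ? new_events : spare_events;

	return active > spare ? active - spare : spare - active;
}
```

With active devices at 10, spares at 9, and a new count of 11, writing the active devices first leaves a gap of 2 after the crash, while writing the spares first leaves a gap of 1.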

This is the sort of thing I was thinking of.
Comments?

Thanks,
NeilBrown

From bf7399c0f1e95e8af30f93114f96fcc73cb0d7c6 Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.de>
Date: Tue, 18 May 2010 09:28:43 +1000
Subject: [PATCH] md: simplify updating of event count to sometimes avoid updating spares.

When updating the event count for a simple clean <-> dirty transition,
we try to avoid updating the spares so they can safely spin-down.
As the event_counts across an array must be within +/- 1 of each other,
this means decrementing the event_count on a dirty->clean transition.
This is not always safe, so we have to avoid decrementing when it is unsafe.
We currently do this with a misguided idea about it being safe or
not depending on whether the event_count is odd or even.  This
approach only works reliably in a few common instances, but easily
falls down.

So instead, simply keep internal state concerning whether it is safe
or not, and always assume it is not safe when an array is first
assembled.

Signed-off-by: NeilBrown <neilb@suse.de>

diff --git a/drivers/md/md.c b/drivers/md/md.c
index fec4abc..9ef21d9 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -2088,7 +2088,6 @@ static void sync_sbs(mddev_t * mddev, int nospares)
 		if (rdev->sb_events == mddev->events ||
 		    (nospares &&
 		     rdev->raid_disk < 0 &&
-		     (rdev->sb_events&1)==0 &&
 		     rdev->sb_events+1 == mddev->events)) {
 			/* Don't update this superblock */
 			rdev->sb_loaded = 2;
@@ -2141,28 +2140,14 @@ repeat:
 	 * and 'events' is odd, we can roll back to the previous clean state */
 	if (nospares
 	    && (mddev->in_sync && mddev->recovery_cp == MaxSector)
-	    && (mddev->events & 1)
-	    && mddev->events != 1)
+	    && mddev->can_decrease_events
+	    && mddev->events != 1) {
 		mddev->events--;
-	else {
+		mddev->can_decrease_events = 0;
+	} else {
 		/* otherwise we have to go forward and ... */
 		mddev->events ++;
-		if (!mddev->in_sync || mddev->recovery_cp != MaxSector) { /* not clean */
-			/* .. if the array isn't clean, an 'even' event must also go
-			 * to spares. */
-			if ((mddev->events&1)==0) {
-				nospares = 0;
-				sync_req = 2; /* force a second update to get the
-					       * even/odd in sync */
-			}
-		} else {
-			/* otherwise an 'odd' event must go to spares */
-			if ((mddev->events&1)) {
-				nospares = 0;
-				sync_req = 2; /* force a second update to get the
-					       * even/odd in sync */
-			}
-		}
+		mddev->can_decrease_events = nospares;
 	}
 
 	if (!mddev->events) {
@@ -4606,6 +4591,7 @@ static void md_clean(mddev_t *mddev)
 	mddev->layout = 0;
 	mddev->max_disks = 0;
 	mddev->events = 0;
+	mddev->can_decrease_events = 0;
 	mddev->delta_disks = 0;
 	mddev->new_level = LEVEL_NONE;
 	mddev->new_layout = 0;
diff --git a/drivers/md/md.h b/drivers/md/md.h
index a536f54..7ab5ea1 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -150,6 +150,12 @@ struct mddev_s
 	int				external_size; /* size managed
 							* externally */
 	__u64				events;
+	/* If the last 'event' was simply a clean->dirty transition, and
+	 * we didn't write it to the spares, then it is safe and simple
+	 * to just decrement the event count on a dirty->clean transition.
+	 * So we record that possibility here.
+	 */
+	int				can_decrease_events;
 
 	char				uuid[16];
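
The event-count logic in the patch above can be tried out in user space with a small model. This is a sketch under stated assumptions, not the kernel implementation: `struct model_array`, `model_update_events` and `model_skip_superblock` are hypothetical names, the `clean` flag stands in for `in_sync && recovery_cp == MaxSector`, and the wrap-around handling for `events == 0` is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the relevant mddev_t fields. */
struct model_array {
	uint64_t events;
	bool clean;		/* models in_sync && recovery_cp == MaxSector */
	int can_decrease_events;
};

/* Models the patched branch in md_update_sb(): roll back only if the
 * previous update was a forward step that skipped the spares. */
static void model_update_events(struct model_array *m, int nospares)
{
	assert(m != NULL);
	if (nospares && m->clean && m->can_decrease_events &&
	    m->events != 1) {
		m->events--;
		m->can_decrease_events = 0;
	} else {
		/* Otherwise go forward and remember whether the spares
		 * were skipped for this update. */
		m->events++;
		m->can_decrease_events = nospares;
	}
}

/* Models the patched test in sync_sbs(): true means the superblock
 * of this device does not need to be rewritten. */
static bool model_skip_superblock(const struct model_array *m,
				  uint64_t sb_events, int raid_disk,
				  int nospares)
{
	return sb_events == m->events ||
	       (nospares && raid_disk < 0 && sb_events + 1 == m->events);
}
```

Starting from a freshly assembled array with `events = 10` and `can_decrease_events = 0`, a clean->dirty update with `nospares = 1` moves `events` to 11 and sets the flag; the following dirty->clean update moves it back to 10 and clears the flag. A spare whose superblock holds 10 is skipped in both steps, so it is never written and can stay spun down.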
