linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: NeilBrown <neilb@suse.de>
To: Andrew Morton <akpm@osdl.org>
Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 006 of 10] md: Set/get state of array via sysfs
Date: Thu, 1 Jun 2006 15:13:58 +1000	[thread overview]
Message-ID: <1060601051358.27613@suse.de> (raw)
In-Reply-To: 20060601150955.27444.patches@notabene


This allows the state of an md/array to be directly controlled
via sysfs and adds the ability to stop and array without
tearing it down.

Array states/settings:

 clear
     No devices, no size, no level
     Equivalent to STOP_ARRAY ioctl
 inactive
     May have some settings, but array is not active
        all IO results in error
     When written, doesn't tear down array, but just stops it
 suspended (not supported yet)
     All IO requests will block. The array can be reconfigured.
     Writing this, if accepted, will block until array is quiescent
 readonly
     no resync can happen.  no superblocks get written.
     write requests fail
 read-auto
     like readonly, but behaves like 'clean' on a write request.

 clean - no pending writes, but otherwise active.
     When written to inactive array, starts without resync
     If a write request arrives then
       if metadata is known, mark 'dirty' and switch to 'active'.
       if not known, block and switch to write-pending
     If written to an active array that has pending writes, then fails.
 active
     fully active: IO and resync can be happening.
     When written to inactive array, starts with resync

 write-pending (not supported yet)
     clean, but writes are blocked waiting for 'active' to be written.

 active-idle
     like active, but no writes have been seen for a while (100msec).

Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./Documentation/md.txt |   39 +++++++++
 ./drivers/md/md.c      |  197 ++++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 227 insertions(+), 9 deletions(-)

diff ./Documentation/md.txt~current~ ./Documentation/md.txt
--- ./Documentation/md.txt~current~	2006-06-01 15:03:28.000000000 +1000
+++ ./Documentation/md.txt	2006-06-01 15:05:29.000000000 +1000
@@ -216,6 +216,45 @@ All md devices contain:
      period as a number of seconds.  The default is 200msec (0.200).
      Writing a value of 0 disables safemode.
 
+   array_state
+     This file contains a single word which describes the current
+     state of the array.  In many cases, the state can be set by
+     writing the word for the desired state, however some states
+     cannot be explicitly set, and some transitions are not allowed.
+
+     clear
+         No devices, no size, no level
+         Writing is equivalent to STOP_ARRAY ioctl
+     inactive
+         May have some settings, but array is not active
+            all IO results in error
+         When written, doesn't tear down array, but just stops it
+     suspended (not supported yet)
+         All IO requests will block. The array can be reconfigured.
+         Writing this, if accepted, will block until array is quiessent
+     readonly
+         no resync can happen.  no superblocks get written.
+         write requests fail
+     read-auto
+         like readonly, but behaves like 'clean' on a write request.
+
+     clean - no pending writes, but otherwise active.
+         When written to inactive array, starts without resync
+         If a write request arrives then
+           if metadata is known, mark 'dirty' and switch to 'active'.
+           if not known, block and switch to write-pending
+         If written to an active array that has pending writes, then fails.
+     active
+         fully active: IO and resync can be happening.
+         When written to inactive array, starts with resync
+
+     write-pending
+         clean, but writes are blocked waiting for 'active' to be written.
+
+     active-idle
+         like active, but no writes have been seen for a while (safe_mode_delay).
+
+
    sync_speed_min
    sync_speed_max
      This are similar to /proc/sys/dev/raid/speed_limit_{min,max}

diff ./drivers/md/md.c~current~ ./drivers/md/md.c
--- ./drivers/md/md.c~current~	2006-06-01 15:05:29.000000000 +1000
+++ ./drivers/md/md.c	2006-06-01 15:05:29.000000000 +1000
@@ -2185,6 +2185,176 @@ chunk_size_store(mddev_t *mddev, const c
 static struct md_sysfs_entry md_chunk_size =
 __ATTR(chunk_size, 0644, chunk_size_show, chunk_size_store);
 
+/*
+ * The array state can be:
+ *
+ * clear
+ *     No devices, no size, no level
+ *     Equivalent to STOP_ARRAY ioctl
+ * inactive
+ *     May have some settings, but array is not active
+ *        all IO results in error
+ *     When written, doesn't tear down array, but just stops it
+ * suspended (not supported yet)
+ *     All IO requests will block. The array can be reconfigured.
+ *     Writing this, if accepted, will block until array is quiessent
+ * readonly
+ *     no resync can happen.  no superblocks get written.
+ *     write requests fail
+ * read-auto
+ *     like readonly, but behaves like 'clean' on a write request.
+ *
+ * clean - no pending writes, but otherwise active.
+ *     When written to inactive array, starts without resync
+ *     If a write request arrives then
+ *       if metadata is known, mark 'dirty' and switch to 'active'.
+ *       if not known, block and switch to write-pending
+ *     If written to an active array that has pending writes, then fails.
+ * active
+ *     fully active: IO and resync can be happening.
+ *     When written to inactive array, starts with resync
+ *
+ * write-pending
+ *     clean, but writes are blocked waiting for 'active' to be written.
+ *
+ * active-idle
+ *     like active, but no writes have been seen for a while (100msec).
+ *
+ */
+enum array_state { clear, inactive, suspended, readonly, read_auto, clean, active,
+		   write_pending, active_idle, bad_word};
+char *array_states[] = {
+	"clear", "inactive", "suspended", "readonly", "read-auto", "clean", "active",
+	"write-pending", "active-idle", NULL };
+
+static int match_word(const char *word, char **list)
+{
+	int n;
+	for (n=0; list[n]; n++)
+		if (cmd_match(word, list[n]))
+			break;
+	return n;
+}
+
+static ssize_t
+array_state_show(mddev_t *mddev, char *page)
+{
+	enum array_state st = inactive;
+
+	if (mddev->pers)
+		switch(mddev->ro) {
+		case 1:
+			st = readonly;
+			break;
+		case 2:
+			st = read_auto;
+			break;
+		case 0:
+			if (mddev->in_sync)
+				st = clean;
+			else if (mddev->safemode)
+				st = active_idle;
+			else
+				st = active;
+		}
+	else {
+		if (list_empty(&mddev->disks) &&
+		    mddev->raid_disks == 0 &&
+		    mddev->size == 0)
+			st = clear;
+		else
+			st = inactive;
+	}
+	return sprintf(page, "%s\n", array_states[st]);
+}
+
+static int do_md_stop(mddev_t * mddev, int ro);
+static int do_md_run(mddev_t * mddev);
+static int restart_array(mddev_t *mddev);
+
+static ssize_t
+array_state_store(mddev_t *mddev, const char *buf, size_t len)
+{
+	int err = -EINVAL;
+	enum array_state st = match_word(buf, array_states);
+	switch(st) {
+	case bad_word:
+		break;
+	case clear:
+		/* stopping an active array */
+		if (mddev->pers) {
+			if (atomic_read(&mddev->active) > 1)
+				return -EBUSY;
+			err = do_md_stop(mddev, 0);
+		}
+		break;
+	case inactive:
+		/* stopping an active array */
+		if (mddev->pers) {
+			if (atomic_read(&mddev->active) > 1)
+				return -EBUSY;
+			err = do_md_stop(mddev, 2);
+		}
+		break;
+	case suspended:
+		break; /* not supported yet */
+	case readonly:
+		if (mddev->pers)
+			err = do_md_stop(mddev, 1);
+		else {
+			mddev->ro = 1;
+			err = do_md_run(mddev);
+		}
+		break;
+	case read_auto:
+		/* stopping an active array */
+		if (mddev->pers) {
+			err = do_md_stop(mddev, 1);
+			if (err == 0)
+				mddev->ro = 2; /* FIXME mark devices writable */
+		} else {
+			mddev->ro = 2;
+			err = do_md_run(mddev);
+		}
+		break;
+	case clean:
+		if (mddev->pers) {
+			restart_array(mddev);
+			spin_lock_irq(&mddev->write_lock);
+			if (atomic_read(&mddev->writes_pending) == 0) {
+				mddev->in_sync = 1;
+				mddev->sb_dirty = 1;
+			}
+			spin_unlock_irq(&mddev->write_lock);
+		} else {
+			mddev->ro = 0;
+			mddev->recovery_cp = MaxSector;
+			err = do_md_run(mddev);
+		}
+		break;
+	case active:
+		if (mddev->pers) {
+			restart_array(mddev);
+			mddev->sb_dirty = 0;
+			wake_up(&mddev->sb_wait);
+			err = 0;
+		} else {
+			mddev->ro = 0;
+			err = do_md_run(mddev);
+		}
+		break;
+	case write_pending:
+	case active_idle:
+		/* these cannot be set */
+		break;
+	}
+	if (err)
+		return err;
+	else
+		return len;
+}
+static struct md_sysfs_entry md_array_state = __ATTR(array_state, 0644, array_state_show, array_state_store);
+
 static ssize_t
 null_show(mddev_t *mddev, char *page)
 {
@@ -2553,6 +2723,7 @@ static struct attribute *md_default_attr
 	&md_metadata.attr,
 	&md_new_device.attr,
 	&md_safe_delay.attr,
+	&md_array_state.attr,
 	NULL,
 };
 
@@ -2919,11 +3090,8 @@ static int restart_array(mddev_t *mddev)
 		md_wakeup_thread(mddev->thread);
 		md_wakeup_thread(mddev->sync_thread);
 		err = 0;
-	} else {
-		printk(KERN_ERR "md: %s has no personality assigned.\n",
-			mdname(mddev));
+	} else
 		err = -EINVAL;
-	}
 
 out:
 	return err;
@@ -2955,7 +3123,12 @@ static void restore_bitmap_write_access(
 	spin_unlock(&inode->i_lock);
 }
 
-static int do_md_stop(mddev_t * mddev, int ro)
+/* mode:
+ *   0 - completely stop and dis-assemble array
+ *   1 - switch to readonly
+ *   2 - stop but do not disassemble array
+ */
+static int do_md_stop(mddev_t * mddev, int mode)
 {
 	int err = 0;
 	struct gendisk *disk = mddev->gendisk;
@@ -2977,12 +3150,15 @@ static int do_md_stop(mddev_t * mddev, i
 
 		invalidate_partition(disk, 0);
 
-		if (ro) {
+		switch(mode) {
+		case 1: /* readonly */
 			err  = -ENXIO;
 			if (mddev->ro==1)
 				goto out;
 			mddev->ro = 1;
-		} else {
+			break;
+		case 0: /* disassemble */
+		case 2: /* stop */
 			bitmap_flush(mddev);
 			md_super_wait(mddev);
 			if (mddev->ro)
@@ -3002,7 +3178,7 @@ static int do_md_stop(mddev_t * mddev, i
 			mddev->in_sync = 1;
 			md_update_sb(mddev);
 		}
-		if (ro)
+		if (mode == 1)
 			set_disk_ro(disk, 1);
 		clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
 	}
@@ -3010,7 +3186,7 @@ static int do_md_stop(mddev_t * mddev, i
 	/*
 	 * Free resources if final stop
 	 */
-	if (!ro) {
+	if (mode == 0) {
 		mdk_rdev_t *rdev;
 		struct list_head *tmp;
 		struct gendisk *disk;
@@ -3034,6 +3210,9 @@ static int do_md_stop(mddev_t * mddev, i
 		export_array(mddev);
 
 		mddev->array_size = 0;
+		mddev->size = 0;
+		mddev->raid_disks = 0;
+
 		disk = mddev->gendisk;
 		if (disk)
 			set_capacity(disk, 0);

  parent reply	other threads:[~2006-06-01  5:13 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-06-01  5:13 [PATCH 000 of 10] md: Introduction - assorted patches for -mm NeilBrown
2006-06-01  5:13 ` [PATCH 001 of 10] md: md Kconfig speeling feex NeilBrown
2006-06-01  5:13 ` [PATCH 002 of 10] md: Fix Kconfig error NeilBrown
2006-06-01  5:13 ` [PATCH 003 of 10] md: Fix bug that stops raid5 resync from happening NeilBrown
2006-06-01  5:13 ` [PATCH 004 of 10] md: Allow re-add to work on array without bitmaps NeilBrown
2006-06-01  5:13 ` [PATCH 005 of 10] md: Don't write dirty/clean update to spares - leave them alone NeilBrown
2006-06-01  5:13 ` NeilBrown [this message]
2006-06-01  5:33   ` [PATCH 006 of 10] md: Set/get state of array via sysfs Chris Wright
2006-06-01  5:43     ` Neil Brown
2006-06-01  5:14 ` [PATCH 007 of 10] md: Allow rdev state to be set " NeilBrown
2006-06-01  5:14 ` [PATCH 008 of 10] md: Allow raid 'layout' to be read and " NeilBrown
2006-06-01  5:34   ` Chris Wright
2006-06-01  5:44     ` Neil Brown
2006-06-01  5:14 ` [PATCH 009 of 10] md: Allow resync_start to be set and queried " NeilBrown
2006-06-01  5:14 ` [PATCH 010 of 10] md: Allow the write_mostly flag to be set " NeilBrown
2006-08-05  5:59   ` Mike Snitzer
2006-08-05 23:43     ` Mike Snitzer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1060601051358.27613@suse.de \
    --to=neilb@suse.de \
    --cc=akpm@osdl.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-raid@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).