linux-raid.vger.kernel.org archive mirror
* [PATCH 0 of 2 - v2] DM RAID: Add message/status support for changing sync action
@ 2013-03-19 17:15 Jonathan Brassow
  2013-03-19 17:18 ` [PATCH 1 of 2 - v2] MD: Export 'md_reap_sync_thread' function Jonathan Brassow
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Jonathan Brassow @ 2013-03-19 17:15 UTC (permalink / raw)
  To: linux-raid; +Cc: neilb, agk, jbrassow

Neil,

I've made the updates that Alasdair suggested, reworked the patch to
apply to 3.9.0-rc3 instead of 3.8.0, and split it into two pieces:
- export reap_sync_thread: to ensure completion of "idle"/"frozen"
commands
- Add message/status support for changing sync action

I still have two questions though:
1)  The mismatch_count is only valid after a "check" has been run,
right?  If I run a "repair", I would expect it to be reset upon
completion, but it is not - not until "check" has been run again.  Is
this the expected behavior?
2)  It is possible to issue "frozen" on an array that is undergoing
"resync" and then issue a "check".  I would expect an EBUSY or
something.  Additionally, checkpointing seems to be done and seems to
make it possible to e.g. "resync" the first 25%, "check" the second 25%
and then finish the last half with a "resync" again.  This would be
really stupid of the user, but should we catch it?  (This would probably
be a follow-on patch if "yes".)

Thanks,
 brassow

P.S.  Here is the expected (and tested) output of the various states if
interested:
Initial (automated) sync:
- health_chars should all be 'a'
- sync_ratio should show progress
- sync_action should be "resync"
[root@bp-01 ~]# dmsetup table vg-lv; dmsetup status vg-lv
0 10485760 raid raid1 3 0 region_size 1024 2 254:3 254:4 254:5 254:6
0 10485760 raid raid1 2 aa 5029120/10485760 resync 0

Nominal state:
- health_chars should all be 'A'
- sync_ratio should show 100% (same numerator and denominator)
- sync_action should be "idle"
[root@bp-01 ~]# dmsetup table vg-lv; dmsetup status vg-lv
0 10485760 raid raid1 3 0 region_size 1024 2 254:3 254:4 254:5 254:6
0 10485760 raid raid1 2 AA 10485760/10485760 idle 0

Rebuild/replace a device:
- health_chars show devices being replaced as 'a', but others as 'A'
- sync_ratio should show progress
- sync_action should be "recover"
[root@bp-01 ~]# dmsetup table vg-lv; dmsetup status vg-lv
0 10485760 raid raid1 3 0 region_size 1024 2 254:8 254:9 254:5 254:6
0 10485760 raid raid1 2 aA 655488/10485760 recover 0

Check/scrub:
- health_chars should all be 'A'
- sync_ratio should show progress of "check"
- sync_action should be "check"
[root@bp-01 linux-upstream]# dmsetup table vg-lv; dmsetup status vg-lv
0 10485760 raid raid1 3 0 region_size 1024 2 254:3 254:4 254:5 254:6
0 10485760 raid raid1 2 AA 1310976/10485760 check 0

Repair:
- health_chars should all be 'A'
- sync_ratio should show progress of "repair"
- sync_action should be "repair"
[root@bp-01 linux-upstream]# dmsetup table vg-lv; dmsetup status vg-lv
0 10485760 raid raid1 3 0 region_size 1024 2 254:3 254:4 254:5 254:6
0 10485760 raid raid1 2 AA 655488/10485760 repair 0

Check/scrub (when devices differ):
- health_chars should all be 'A'
- sync_ratio should show progress of "check"
- sync_action should be "check"
- mismatch_cnt should show a count of discrepancies
[root@bp-01 linux-upstream]# dmsetup table vg-lv; dmsetup status vg-lv
0 10485760 raid raid1 3 0 region_size 1024 2 254:8 254:9 254:5 254:6
0 10485760 raid raid1 2 AA 655488/10485760 check 81920

Repair:
- health_chars should all be 'A'
- sync_ratio should show progress of "repair"
- sync_action should be "repair"
- IS MISMATCH_CNT INVALID UNTIL A "check" IS RUN AGAIN?
- SHOULD "repair" RESET MISMATCH_CNT WHEN FINISHED?
[root@bp-01 linux-upstream]# dmsetup table vg-lv; dmsetup status vg-lv
0 10485760 raid raid1 3 0 region_size 1024 2 254:8 254:9 254:5 254:6
0 10485760 raid raid1 2 AA 655488/10485760 repair 81920
[root@bp-01 linux-upstream]# dmsetup table vg-lv; dmsetup status vg-lv
0 10485760 raid raid1 3 0 region_size 1024 2 254:8 254:9 254:5 254:6
0 10485760 raid raid1 2 AA 10485760/10485760 idle 81920

Possible issues:
- We can freeze initialization and then start a check instead.
  SHOULD THIS PRODUCE AN ERROR (like EBUSY)?
[root@bp-01 linux-upstream]# dmsetup table vg-lv; dmsetup status vg-lv
0 10485760 raid raid1 3 0 region_size 1024 2 254:3 254:4 254:5 254:6
0 10485760 raid raid1 2 aa 2715136/10485760 resync 0
[root@bp-01 linux-upstream]# dmsetup message vg-lv 0 frozen
[root@bp-01 linux-upstream]# dmsetup table vg-lv; dmsetup status vg-lv
0 10485760 raid raid1 3 0 region_size 1024 2 254:3 254:4 254:5 254:6
0 10485760 raid raid1 2 aa 5558784/10485760 frozen 0
[root@bp-01 linux-upstream]# dmsetup message vg-lv 0 check
[root@bp-01 linux-upstream]# dmsetup table vg-lv; dmsetup status vg-lv
0 10485760 raid raid1 3 0 region_size 1024 2 254:3 254:4 254:5 254:6
0 10485760 raid raid1 2 AA 655488/10485760 check 0
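
For scripting against the new output, the STATUSTYPE_INFO line in the
transcripts above has a fixed field order.  A minimal, illustrative Python
parser (not part of the patches; the dict keys are my own names):

```python
def parse_raid_status(line):
    """Parse a dm-raid STATUSTYPE_INFO line, i.e. the output of
    'dmsetup status' with the new sync_action/mismatch_cnt fields:
    <s> <l> raid <raid_type> <#devices> <health_chars> \
        <sync_ratio> <sync_action> <mismatch_cnt>"""
    fields = line.split()
    assert fields[2] == "raid", "not a raid target status line"
    done, total = (int(x) for x in fields[6].split("/"))
    return {
        "raid_type": fields[3],
        "nr_devices": int(fields[4]),
        "health_chars": fields[5],  # 'A' in-sync, 'a' not in-sync, 'D' dead
        "sync_ratio": (done, total),
        "sync_action": fields[7],   # idle/frozen/resync/recover/check/repair/reshape
        "mismatch_cnt": int(fields[8]),
    }

status = parse_raid_status("0 10485760 raid raid1 2 aa 5029120/10485760 resync 0")
```
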





* [PATCH 1 of 2 - v2] MD: Export 'md_reap_sync_thread' function
  2013-03-19 17:15 [PATCH 0 of 2 - v2] DM RAID: Add message/status support for changing sync action Jonathan Brassow
@ 2013-03-19 17:18 ` Jonathan Brassow
  2013-03-19 17:20 ` [PATCH 2 of 2 - v2] DM RAID: Add message/status support for changing sync action Jonathan Brassow
  2013-03-19 23:43 ` [PATCH 0 " NeilBrown
  2 siblings, 0 replies; 4+ messages in thread
From: Jonathan Brassow @ 2013-03-19 17:18 UTC (permalink / raw)
  To: linux-raid; +Cc: neilb, agk, jbrassow

MD: Export 'md_reap_sync_thread' function

Make 'md_reap_sync_thread' available to other files, specifically dm-raid.c.
- rename reap_sync_thread to md_reap_sync_thread
- move the fn after md_check_recovery to match md.h declaration placement
- export md_reap_sync_thread

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>

Index: linux-upstream/drivers/md/md.c
===================================================================
--- linux-upstream.orig/drivers/md/md.c
+++ linux-upstream/drivers/md/md.c
@@ -4225,8 +4225,6 @@ action_show(struct mddev *mddev, char *p
 	return sprintf(page, "%s\n", type);
 }
 
-static void reap_sync_thread(struct mddev *mddev);
-
 static ssize_t
 action_store(struct mddev *mddev, const char *page, size_t len)
 {
@@ -4241,7 +4239,7 @@ action_store(struct mddev *mddev, const
 	if (cmd_match(page, "idle") || cmd_match(page, "frozen")) {
 		if (mddev->sync_thread) {
 			set_bit(MD_RECOVERY_INTR, &mddev->recovery);
-			reap_sync_thread(mddev);
+			md_reap_sync_thread(mddev);
 		}
 	} else if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery) ||
 		   test_bit(MD_RECOVERY_NEEDED, &mddev->recovery))
@@ -5279,7 +5277,7 @@ static void __md_stop_writes(struct mdde
 	if (mddev->sync_thread) {
 		set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
 		set_bit(MD_RECOVERY_INTR, &mddev->recovery);
-		reap_sync_thread(mddev);
+		md_reap_sync_thread(mddev);
 	}
 
 	del_timer_sync(&mddev->safemode_timer);
@@ -7689,51 +7687,6 @@ static int remove_and_add_spares(struct
 	return spares;
 }
 
-static void reap_sync_thread(struct mddev *mddev)
-{
-	struct md_rdev *rdev;
-
-	/* resync has finished, collect result */
-	md_unregister_thread(&mddev->sync_thread);
-	if (!test_bit(MD_RECOVERY_INTR, &mddev->recovery) &&
-	    !test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery)) {
-		/* success...*/
-		/* activate any spares */
-		if (mddev->pers->spare_active(mddev)) {
-			sysfs_notify(&mddev->kobj, NULL,
-				     "degraded");
-			set_bit(MD_CHANGE_DEVS, &mddev->flags);
-		}
-	}
-	if (test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) &&
-	    mddev->pers->finish_reshape)
-		mddev->pers->finish_reshape(mddev);
-
-	/* If array is no-longer degraded, then any saved_raid_disk
-	 * information must be scrapped.  Also if any device is now
-	 * In_sync we must scrape the saved_raid_disk for that device
-	 * do the superblock for an incrementally recovered device
-	 * written out.
-	 */
-	rdev_for_each(rdev, mddev)
-		if (!mddev->degraded ||
-		    test_bit(In_sync, &rdev->flags))
-			rdev->saved_raid_disk = -1;
-
-	md_update_sb(mddev, 1);
-	clear_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
-	clear_bit(MD_RECOVERY_SYNC, &mddev->recovery);
-	clear_bit(MD_RECOVERY_RESHAPE, &mddev->recovery);
-	clear_bit(MD_RECOVERY_REQUESTED, &mddev->recovery);
-	clear_bit(MD_RECOVERY_CHECK, &mddev->recovery);
-	/* flag recovery needed just to double check */
-	set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
-	sysfs_notify_dirent_safe(mddev->sysfs_action);
-	md_new_event(mddev);
-	if (mddev->event_work.func)
-		queue_work(md_misc_wq, &mddev->event_work);
-}
-
 /*
  * This routine is regularly called by all per-raid-array threads to
  * deal with generic issues like resync and super-block update.
@@ -7836,7 +7789,7 @@ void md_check_recovery(struct mddev *mdd
 			goto unlock;
 		}
 		if (mddev->sync_thread) {
-			reap_sync_thread(mddev);
+			md_reap_sync_thread(mddev);
 			goto unlock;
 		}
 		/* Set RUNNING before clearing NEEDED to avoid
@@ -7917,6 +7870,51 @@ void md_check_recovery(struct mddev *mdd
 	}
 }
 
+void md_reap_sync_thread(struct mddev *mddev)
+{
+	struct md_rdev *rdev;
+
+	/* resync has finished, collect result */
+	md_unregister_thread(&mddev->sync_thread);
+	if (!test_bit(MD_RECOVERY_INTR, &mddev->recovery) &&
+	    !test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery)) {
+		/* success...*/
+		/* activate any spares */
+		if (mddev->pers->spare_active(mddev)) {
+			sysfs_notify(&mddev->kobj, NULL,
+				     "degraded");
+			set_bit(MD_CHANGE_DEVS, &mddev->flags);
+		}
+	}
+	if (test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) &&
+	    mddev->pers->finish_reshape)
+		mddev->pers->finish_reshape(mddev);
+
+	/* If array is no-longer degraded, then any saved_raid_disk
+	 * information must be scrapped.  Also if any device is now
+	 * In_sync we must scrape the saved_raid_disk for that device
+	 * do the superblock for an incrementally recovered device
+	 * written out.
+	 */
+	rdev_for_each(rdev, mddev)
+		if (!mddev->degraded ||
+		    test_bit(In_sync, &rdev->flags))
+			rdev->saved_raid_disk = -1;
+
+	md_update_sb(mddev, 1);
+	clear_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
+	clear_bit(MD_RECOVERY_SYNC, &mddev->recovery);
+	clear_bit(MD_RECOVERY_RESHAPE, &mddev->recovery);
+	clear_bit(MD_RECOVERY_REQUESTED, &mddev->recovery);
+	clear_bit(MD_RECOVERY_CHECK, &mddev->recovery);
+	/* flag recovery needed just to double check */
+	set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
+	sysfs_notify_dirent_safe(mddev->sysfs_action);
+	md_new_event(mddev);
+	if (mddev->event_work.func)
+		queue_work(md_misc_wq, &mddev->event_work);
+}
+
 void md_wait_for_blocked_rdev(struct md_rdev *rdev, struct mddev *mddev)
 {
 	sysfs_notify_dirent_safe(rdev->sysfs_state);
@@ -8642,6 +8640,7 @@ EXPORT_SYMBOL(md_register_thread);
 EXPORT_SYMBOL(md_unregister_thread);
 EXPORT_SYMBOL(md_wakeup_thread);
 EXPORT_SYMBOL(md_check_recovery);
+EXPORT_SYMBOL(md_reap_sync_thread);
 MODULE_LICENSE("GPL");
 MODULE_DESCRIPTION("MD RAID framework");
 MODULE_ALIAS("md");
Index: linux-upstream/drivers/md/md.h
===================================================================
--- linux-upstream.orig/drivers/md/md.h
+++ linux-upstream/drivers/md/md.h
@@ -567,6 +567,7 @@ extern struct md_thread *md_register_thr
 extern void md_unregister_thread(struct md_thread **threadp);
 extern void md_wakeup_thread(struct md_thread *thread);
 extern void md_check_recovery(struct mddev *mddev);
+extern void md_reap_sync_thread(struct mddev *mddev);
 extern void md_write_start(struct mddev *mddev, struct bio *bi);
 extern void md_write_end(struct mddev *mddev);
 extern void md_done_sync(struct mddev *mddev, int blocks, int ok);




* [PATCH 2 of 2 - v2] DM RAID: Add message/status support for changing sync action
  2013-03-19 17:15 [PATCH 0 of 2 - v2] DM RAID: Add message/status support for changing sync action Jonathan Brassow
  2013-03-19 17:18 ` [PATCH 1 of 2 - v2] MD: Export 'md_reap_sync_thread' function Jonathan Brassow
@ 2013-03-19 17:20 ` Jonathan Brassow
  2013-03-19 23:43 ` [PATCH 0 " NeilBrown
  2 siblings, 0 replies; 4+ messages in thread
From: Jonathan Brassow @ 2013-03-19 17:20 UTC (permalink / raw)
  To: linux-raid; +Cc: neilb, agk, jbrassow

DM RAID:  Add message/status support for changing sync action

This patch adds a message interface to dm-raid to allow the user to more
finely control the sync actions being performed by the MD driver.  This
gives the user the ability to initiate "check" and "repair" (i.e. scrubbing).
Two additional fields have been appended to the status output to provide more
information about the type of sync action occurring and the results of those
actions, specifically: <sync_action> and <mismatch_cnt>.  These new fields
will always be populated.  This is essentially the device-mapper way of doing
what MD controls through the 'sync_action' sysfs file and shows through the
'mismatch_cnt' sysfs file.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>

Index: linux-upstream/drivers/md/dm-raid.c
===================================================================
--- linux-upstream.orig/drivers/md/dm-raid.c
+++ linux-upstream/drivers/md/dm-raid.c
@@ -1279,6 +1279,31 @@ static int raid_map(struct dm_target *ti
 	return DM_MAPIO_SUBMITTED;
 }
 
+static const char *decipher_sync_action(struct mddev *mddev)
+{
+	if (test_bit(MD_RECOVERY_FROZEN, &mddev->recovery))
+		return "frozen";
+
+	if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery) ||
+	    (!mddev->ro && test_bit(MD_RECOVERY_NEEDED, &mddev->recovery))) {
+		if (test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery))
+			return "reshape";
+
+		if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) {
+			if (!test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery))
+				return "resync";
+			else if (test_bit(MD_RECOVERY_CHECK, &mddev->recovery))
+				return "check";
+			return "repair";
+		}
+
+		if (test_bit(MD_RECOVERY_RECOVER, &mddev->recovery))
+			return "recover";
+	}
+
+	return "idle";
+}
+
 static void raid_status(struct dm_target *ti, status_type_t type,
 			unsigned status_flags, char *result, unsigned maxlen)
 {
@@ -1298,8 +1323,18 @@ static void raid_status(struct dm_target
 			sync = rs->md.recovery_cp;
 
 		if (sync >= rs->md.resync_max_sectors) {
+			/*
+			 * Sync complete.
+			 */
 			array_in_sync = 1;
 			sync = rs->md.resync_max_sectors;
+		} else if (test_bit(MD_RECOVERY_REQUESTED, &rs->md.recovery)) {
+			/*
+			 * If "check" or "repair" is occurring, the array has
+			 * undergone an initial sync and the health characters
+			 * should not be 'a' anymore.
+			 */
+			array_in_sync = 1;
 		} else {
 			/*
 			 * The array may be doing an initial sync, or it may
@@ -1311,6 +1346,7 @@ static void raid_status(struct dm_target
 				if (!test_bit(In_sync, &rs->dev[i].rdev.flags))
 					array_in_sync = 1;
 		}
+
 		/*
 		 * Status characters:
 		 *  'D' = Dead/Failed device
@@ -1339,6 +1375,21 @@ static void raid_status(struct dm_target
 		       (unsigned long long) sync,
 		       (unsigned long long) rs->md.resync_max_sectors);
 
+		/*
+		 * Sync action:
+		 *   See Documentation/device-mapper/dm-raid.txt for
+		 *   information on each of these states.
+		 */
+		DMEMIT(" %s", decipher_sync_action(&rs->md));
+
+		/*
+		 * resync_mismatches/mismatch_cnt
+		 *   This field shows the number of discrepancies found when
+		 *   performing a "check" of the array.
+		 */
+		DMEMIT(" %llu",
+		       (unsigned long long)
+		       atomic64_read(&rs->md.resync_mismatches));
 		break;
 	case STATUSTYPE_TABLE:
 		/* The string you would use to construct this array */
@@ -1425,7 +1476,62 @@ static void raid_status(struct dm_target
 	}
 }
 
-static int raid_iterate_devices(struct dm_target *ti, iterate_devices_callout_fn fn, void *data)
+static int raid_message(struct dm_target *ti, unsigned argc, char **argv)
+{
+	struct raid_set *rs = ti->private;
+	struct mddev *mddev = &rs->md;
+
+	if (!strcasecmp(argv[0], "reshape")) {
+		DMERR("Reshape not supported.");
+		return -EINVAL;
+	}
+
+	if (!mddev->pers || !mddev->pers->sync_request)
+		return -EINVAL;
+
+	if (!strcasecmp(argv[0], "frozen"))
+		set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
+	else
+		clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
+
+	if (!strcasecmp(argv[0], "idle") || !strcasecmp(argv[0], "frozen")) {
+		if (mddev->sync_thread) {
+			set_bit(MD_RECOVERY_INTR, &mddev->recovery);
+			md_reap_sync_thread(mddev);
+		}
+	} else if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery) ||
+		   test_bit(MD_RECOVERY_NEEDED, &mddev->recovery))
+		return -EBUSY;
+	else if (!strcasecmp(argv[0], "resync"))
+		set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
+	else if (!strcasecmp(argv[0], "recover")) {
+		set_bit(MD_RECOVERY_RECOVER, &mddev->recovery);
+		set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
+	} else {
+		if (!strcasecmp(argv[0], "check"))
+			set_bit(MD_RECOVERY_CHECK, &mddev->recovery);
+		else if (!!strcasecmp(argv[0], "repair"))
+			return -EINVAL;
+		set_bit(MD_RECOVERY_REQUESTED, &mddev->recovery);
+		set_bit(MD_RECOVERY_SYNC, &mddev->recovery);
+	}
+	if (mddev->ro == 2) {
+		/* A write to sync_action is enough to justify
+		 * canceling read-auto mode
+		 */
+		mddev->ro = 0;
+		if (!mddev->suspended)
+			md_wakeup_thread(mddev->sync_thread);
+	}
+	set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
+	if (!mddev->suspended)
+		md_wakeup_thread(mddev->thread);
+
+	return 0;
+}
+
+static int raid_iterate_devices(struct dm_target *ti,
+				iterate_devices_callout_fn fn, void *data)
 {
 	struct raid_set *rs = ti->private;
 	unsigned i;
@@ -1482,12 +1588,13 @@ static void raid_resume(struct dm_target
 
 static struct target_type raid_target = {
 	.name = "raid",
-	.version = {1, 4, 2},
+	.version = {1, 5, 0},
 	.module = THIS_MODULE,
 	.ctr = raid_ctr,
 	.dtr = raid_dtr,
 	.map = raid_map,
 	.status = raid_status,
+	.message = raid_message,
 	.iterate_devices = raid_iterate_devices,
 	.io_hints = raid_io_hints,
 	.presuspend = raid_presuspend,
Index: linux-upstream/Documentation/device-mapper/dm-raid.txt
===================================================================
--- linux-upstream.orig/Documentation/device-mapper/dm-raid.txt
+++ linux-upstream/Documentation/device-mapper/dm-raid.txt
@@ -1,10 +1,13 @@
 dm-raid
--------
+=======
 
 The device-mapper RAID (dm-raid) target provides a bridge from DM to MD.
 It allows the MD RAID drivers to be accessed using a device-mapper
 interface.
 
+
+Mapping Table Interface
+-----------------------
 The target is named "raid" and it accepts the following parameters:
 
   <raid_type> <#raid_params> <raid_params> \
@@ -47,7 +50,7 @@ The target is named "raid" and it accept
     followed by optional parameters (in any order):
 	[sync|nosync]   Force or prevent RAID initialization.
 
-	[rebuild <idx>]	Rebuild drive number idx (first drive is 0).
+	[rebuild <idx>]	Rebuild drive number 'idx' (first drive is 0).
 
 	[daemon_sleep <ms>]
 		Interval between runs of the bitmap daemon that
@@ -56,9 +59,9 @@ The target is named "raid" and it accept
 
 	[min_recovery_rate <kB/sec/disk>]  Throttle RAID initialization
 	[max_recovery_rate <kB/sec/disk>]  Throttle RAID initialization
-	[write_mostly <idx>]		   Drive index is write-mostly
-	[max_write_behind <sectors>]       See '-write-behind=' (man mdadm)
-	[stripe_cache <sectors>]           Stripe cache size (higher RAIDs only)
+	[write_mostly <idx>]		   Mark drive index 'idx' write-mostly.
+	[max_write_behind <sectors>]       See '--write-behind=' (man mdadm)
+	[stripe_cache <sectors>]           Stripe cache size (RAID 4/5/6 only)
 	[region_size <sectors>]
 		The region_size multiplied by the number of regions is the
 		logical size of the array.  The bitmap records the device
@@ -122,7 +125,7 @@ The target is named "raid" and it accept
 	given for both the metadata and data drives for a given position.
 
 
-Example tables
+Example Tables
 --------------
 # RAID4 - 4 data drives, 1 parity (no metadata devices)
 # No metadata devices specified to hold superblock/bitmap info
@@ -141,26 +144,70 @@ Example tables
         raid4 4 2048 sync min_recovery_rate 20 \
         5 8:17 8:18 8:33 8:34 8:49 8:50 8:65 8:66 8:81 8:82
 
+
+Status Output
+-------------
 'dmsetup table' displays the table used to construct the mapping.
 The optional parameters are always printed in the order listed
 above with "sync" or "nosync" always output ahead of the other
 arguments, regardless of the order used when originally loading the table.
 Arguments that can be repeated are ordered by value.
 
-'dmsetup status' yields information on the state and health of the
-array.
-The output is as follows:
+
+'dmsetup status' yields information on the state and health of the array.
+The output is as follows (normally a single line, but expanded here for
+clarity):
 1: <s> <l> raid \
-2:      <raid_type> <#devices> <1 health char for each dev> <resync_ratio>
+2:      <raid_type> <#devices> <health_chars> \
+3:      <sync_ratio> <sync_action> <mismatch_cnt>
 
 Line 1 is the standard output produced by device-mapper.
-Line 2 is produced by the raid target, and best explained by example:
-        0 1960893648 raid raid4 5 AAAAA 2/490221568
+Lines 2 & 3 are produced by the raid target and are best explained by example:
+        0 1960893648 raid raid4 5 AAAAA 2/490221568 resync 0
 Here we can see the RAID type is raid4, there are 5 devices - all of
-which are 'A'live, and the array is 2/490221568 complete with recovery.
-Faulty or missing devices are marked 'D'.  Devices that are out-of-sync
-are marked 'a'.
-
+which are 'A'live, and the array is 2/490221568 complete with its initial
+recovery.  Here is a fuller description of the individual fields:
+	<raid_type>     Same as the <raid_type> used to create the array.
+	<health_chars>  One char for each device, indicating: 'A' = alive and
+			in-sync, 'a' = alive but not in-sync, 'D' = dead/failed.
+	<sync_ratio>    The ratio indicating how much of the array has undergone
+			the process described by 'sync_action'.  If the
+			'sync_action' is "check" or "repair", then the process
+			of "resync" or "recover" can be considered complete.
+	<sync_action>   One of the following possible states:
+			idle    - No synchronization action is being performed.
+			frozen  - The current action has been halted.
+			resync  - Array is undergoing its initial synchronization
+				  or is resynchronizing after an unclean shutdown
+				  (possibly aided by a bitmap).
+			recover - A device in the array is being rebuilt or
+				  replaced.
+			check   - A user-initiated full check of the array is
+				  being performed.  All blocks are read and
+				  checked for consistency.  The number of
+				  discrepancies found is recorded in
+				  <mismatch_cnt>.  No changes are made to the
+				  array by this action.
+			repair  - The same as "check", but discrepancies are
+				  corrected.
+			reshape - The array is undergoing a reshape.
+	<mismatch_cnt>  The number of discrepancies found between mirror copies
+			in RAID1/10 or wrong parity values found in RAID4/5/6.
+			This value is valid only after a "check" of the array
+			is performed.  A healthy array has a 'mismatch_cnt' of 0.
+
+Message Interface
+-----------------
+The dm-raid target will accept certain actions through the 'message' interface.
+('man dmsetup' for more information on the message interface.)  These actions
+include:
+	"idle"   - Halt the current sync action.
+	"frozen" - Freeze the current sync action.
+	"resync" - Initiate/continue a resync.
+	"recover"- Initiate/continue a recover process.
+	"check"  - Initiate a check (i.e. a "scrub") of the array.
+	"repair" - Initiate a repair of the array.
+	"reshape"- Currently unsupported (-EINVAL).
 
 Version History
 ---------------
@@ -171,4 +218,7 @@ Version History
 1.3.1	Allow device replacement/rebuild for RAID 10
 1.3.2   Fix/improve redundancy checking for RAID10
 1.4.0	Non-functional change.  Removes arg from mapping function.
-1.4.1   Add RAID10 "far" and "offset" algorithm support.
+1.4.1   RAID10 fix redundancy validation checks (commit 55ebbb5).
+1.4.2   Add RAID10 "far" and "offset" algorithm support.
+1.5.0   Add message interface to allow manipulation of the sync_action.
+	New status (STATUSTYPE_INFO) fields: sync_action and mismatch_cnt.




* Re: [PATCH 0 of 2 - v2] DM RAID: Add message/status support for changing sync action
  2013-03-19 17:15 [PATCH 0 of 2 - v2] DM RAID: Add message/status support for changing sync action Jonathan Brassow
  2013-03-19 17:18 ` [PATCH 1 of 2 - v2] MD: Export 'md_reap_sync_thread' function Jonathan Brassow
  2013-03-19 17:20 ` [PATCH 2 of 2 - v2] DM RAID: Add message/status support for changing sync action Jonathan Brassow
@ 2013-03-19 23:43 ` NeilBrown
  2 siblings, 0 replies; 4+ messages in thread
From: NeilBrown @ 2013-03-19 23:43 UTC (permalink / raw)
  To: Jonathan Brassow; +Cc: linux-raid, agk


On Tue, 19 Mar 2013 12:15:52 -0500 Jonathan Brassow <jbrassow@redhat.com>
wrote:

> Neil,
> 
> I've made the updates which Alasdair suggested, re-worked the patch to
> apply to 3.9.0-rc3 instead of 3.8.0, and split the patch into two
> pieces:
> - export reap_sync_thread: to ensure completion of "idle"/"frozen"
> commands
> - Add message/status support for changing sync action

Thanks - they look good.  Hopefully they will appear in my -next some time
this week... unless they should go through the dm tree(?).

> 
> I still have two questions though:
> 1)  The mismatch_count is only valid after a "check" has been run,
> right?  If I run a "repair", I would expect it to be reset upon
> completion, but it is not - not until "check" has been run again.  Is
> this the expected behavior?

mismatch count is the number of mismatches (in sectors) found in the last
check or repair - and possibly sync depending on how sync is implemented for
that personality.

So you run 'check' and then mismatch_cnt says how many mismatches were
found but not repaired.
Then you run 'repair' and  mismatch_cnt says how many mismatches were
found and repaired, hopefully the same number as with 'check'.
Then you run 'check' again and hopefully mismatch_cnt will be '0'.

i.e. it isn't how many mismatches there "are", only how many were "found".


> 2)  It is possible to issue "frozen" on an array that is undergoing
> "resync" and then issue a "check".  I would expect an EBUSY or
> something.  Additionally, checkpointing seems to be done and seems to
> make it possible to e.g. "resync" the first 25%, "check" the second 25%
> and then finish the last half with a "resync" again.  This would be
> really stupid of the user, but should we catch it?  (This would probably
> be a follow-on patch if "yes".)

There are 3 distinct operations: sync, reshape, and recovery.  Only one of
these can be happening at a time.  If sync or reshape are needed/requested,
they take priority over recovery.  And sync takes priority over reshape.

There are 3 varieties of sync:
  normal sync
  'check'
  'repair'

normal sync will only process blocks that might not be in sync, so it checks
with the bitmap and the "recovery_cp".  If you try to trigger a sync when
none is needed, then no blocks will be synced.

check and repair ignore the bitmap and the recovery_cp and process every
block, always checking, possibly repairing.
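
The distinction can be sketched as follows (an illustrative model only, not
the MD implementation; the block numbering, set-based bitmap, and treating
recovery_cp as a block index are my simplifications):

```python
def blocks_to_process(action, total_blocks, dirty_bitmap, recovery_cp):
    """Toy model: which block numbers each sync variety would visit.
    'dirty_bitmap' marks possibly-out-of-sync blocks; blocks at or
    beyond 'recovery_cp' have never completed an initial sync."""
    if action in ("check", "repair"):
        # check/repair ignore the bitmap and recovery_cp: every block.
        return set(range(total_blocks))
    if action == "resync":
        # Normal sync: only blocks that might not be in sync.
        return {b for b in range(total_blocks)
                if b in dirty_bitmap or b >= recovery_cp}
    return set()  # idle/frozen: nothing to do
```

So with a normal sync, triggering a resync on a fully in-sync array (empty
bitmap, recovery_cp at the end) visits no blocks at all, while a check of
the same array still walks everything.
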

So if a sync is needed and a 'check' is requested then there are several
options:
 - the 'check' request could be rejected
 - those blocks that require 'sync' could get synced anyway, others get
   'checked'.
 - all blocks get 'check'ed, but the bitmap and recovery_cp don't get updated
   so a subsequent 'sync' (which will happen automatically) will fix
   everything that needs fixing.

I don't feel strongly between these, but I do think it would be wrong if the
'check' causes a 'sync' not to happen, but still updated the bitmap and
recovery_cp.

I think I lean slightly toward the third option, but then we would need to be
careful never to allow a 'sync' to start beyond the current 'recovery_cp', as
then the accounting would almost certainly get messed up.

So if you would like to examine the code to see exactly what does happen,
propose which of those three you would prefer, and provide a patch which
makes it happen, that would make me quite happy :-)

Thanks,
NeilBrown




