linux-raid.vger.kernel.org archive mirror
* [PATCH] DM RAID: Add ability to restore transiently failed devices on resume
@ 2013-04-11 20:27 Jonathan Brassow
  2013-04-22  0:43 ` NeilBrown
  2013-05-02 19:19 ` [PATCH - v2] " Jonathan Brassow
  0 siblings, 2 replies; 7+ messages in thread
From: Jonathan Brassow @ 2013-04-11 20:27 UTC (permalink / raw)
  To: linux-raid; +Cc: neilb, jbrassow

DM RAID: Add ability to restore transiently failed devices on resume

This patch adds code to the resume function to check over the devices
in the RAID array.  If any are found to be marked as failed and their
superblocks can be read, an attempt is made to reintegrate them into
the array.  This allows the user to refresh the array with a simple
suspend and resume of the array - rather than having to load a
completely new table, allocate and initialize all the structures and
throw away the old instantiation.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>

Index: linux-upstream/drivers/md/dm-raid.c
===================================================================
--- linux-upstream.orig/drivers/md/dm-raid.c
+++ linux-upstream/drivers/md/dm-raid.c
@@ -1574,12 +1574,56 @@ static void raid_postsuspend(struct dm_t
 
 static void raid_resume(struct dm_target *ti)
 {
+	int i;
+	uint64_t failed_devices, cleared_failed_devices = 0;
+	struct dm_raid_superblock *sb;
 	struct raid_set *rs = ti->private;
+	struct md_rdev *r;
 
 	set_bit(MD_CHANGE_DEVS, &rs->md.flags);
 	if (!rs->bitmap_loaded) {
 		bitmap_load(&rs->md);
 		rs->bitmap_loaded = 1;
+	} else {
+		/*
+		 * A secondary resume while the device is active.
+		 * Take this opportunity to check whether any failed
+		 * devices are reachable again.
+		 */
+		for (i = 0; i < rs->md.raid_disks; i++) {
+			r = &rs->dev[i].rdev;
+			if (test_bit(Faulty, &r->flags) && r->sb_page &&
+			    !read_disk_sb(r, r->sb_size)) {
+				DMINFO("Faulty device #%d has readable super"
+				       "block.  Attempting to revive it.", i);
+				r->raid_disk = i;
+				r->saved_raid_disk = i;
+				clear_bit(Faulty, &r->flags);
+				clear_bit(WriteErrorSeen, &r->flags);
+				clear_bit(In_sync, &r->flags);
+				if (r->mddev->pers->hot_add_disk(r->mddev, r)) {
+					r->raid_disk = -1;
+					r->saved_raid_disk = -1;
+					set_bit(Faulty, &r->flags);
+					set_bit(In_sync, &r->flags);
+				} else {
+					r->recovery_offset = 0;
+					cleared_failed_devices |= 1 << i;
+					set_bit(MD_RECOVERY_RECOVER,
+						&rs->md.recovery);
+					set_bit(MD_RECOVERY_NEEDED,
+						&rs->md.recovery);
+				}
+			}
+		}
+		if (cleared_failed_devices) {
+			rdev_for_each(r, &rs->md) {
+				sb = page_address(r->sb_page);
+				failed_devices = le64_to_cpu(sb->failed_devices);
+				failed_devices &= ~cleared_failed_devices;
+				sb->failed_devices = cpu_to_le64(failed_devices);
+			}
+		}
 	}
 
 	clear_bit(MD_RECOVERY_FROZEN, &rs->md.recovery);
@@ -1588,7 +1632,7 @@ static void raid_resume(struct dm_target
 
 static struct target_type raid_target = {
 	.name = "raid",
-	.version = {1, 5, 0},
+	.version = {1, 5, 1},
 	.module = THIS_MODULE,
 	.ctr = raid_ctr,
 	.dtr = raid_dtr,
Index: linux-upstream/drivers/md/raid1.c
===================================================================
--- linux-upstream.orig/drivers/md/raid1.c
+++ linux-upstream/drivers/md/raid1.c
@@ -1518,8 +1518,9 @@ static int raid1_add_disk(struct mddev *
 		p = conf->mirrors+mirror;
 		if (!p->rdev) {
 
-			disk_stack_limits(mddev->gendisk, rdev->bdev,
-					  rdev->data_offset << 9);
+			if (mddev->gendisk)
+				disk_stack_limits(mddev->gendisk, rdev->bdev,
+						  rdev->data_offset << 9);
 
 			p->head_position = 0;
 			rdev->raid_disk = mirror;
@@ -1558,7 +1559,7 @@ static int raid1_add_disk(struct mddev *
 		clear_bit(Unmerged, &rdev->flags);
 	}
 	md_integrity_add_rdev(rdev, mddev);
-	if (blk_queue_discard(bdev_get_queue(rdev->bdev)))
+	if (mddev->queue && blk_queue_discard(bdev_get_queue(rdev->bdev)))
 		queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, mddev->queue);
 	print_conf(conf);
 	return err;
Index: linux-upstream/drivers/md/raid10.c
===================================================================
--- linux-upstream.orig/drivers/md/raid10.c
+++ linux-upstream/drivers/md/raid10.c
@@ -1806,15 +1806,17 @@ static int raid10_add_disk(struct mddev
 			set_bit(Replacement, &rdev->flags);
 			rdev->raid_disk = mirror;
 			err = 0;
-			disk_stack_limits(mddev->gendisk, rdev->bdev,
-					  rdev->data_offset << 9);
+			if (mddev->gendisk)
+				disk_stack_limits(mddev->gendisk, rdev->bdev,
+						  rdev->data_offset << 9);
 			conf->fullsync = 1;
 			rcu_assign_pointer(p->replacement, rdev);
 			break;
 		}
 
-		disk_stack_limits(mddev->gendisk, rdev->bdev,
-				  rdev->data_offset << 9);
+		if (mddev->gendisk)
+			disk_stack_limits(mddev->gendisk, rdev->bdev,
+					  rdev->data_offset << 9);
 
 		p->head_position = 0;
 		p->recovery_disabled = mddev->recovery_disabled - 1;
Index: linux-upstream/Documentation/device-mapper/dm-raid.txt
===================================================================
--- linux-upstream.orig/Documentation/device-mapper/dm-raid.txt
+++ linux-upstream/Documentation/device-mapper/dm-raid.txt
@@ -222,3 +222,4 @@ Version History
 1.4.2   Add RAID10 "far" and "offset" algorithm support.
 1.5.0   Add message interface to allow manipulation of the sync_action.
 	New status (STATUSTYPE_INFO) fields: sync_action and mismatch_cnt.
+1.5.1   Add ability to restore transiently failed devices on resume.




* Re: [PATCH] DM RAID: Add ability to restore transiently failed devices on resume
  2013-04-11 20:27 [PATCH] DM RAID: Add ability to restore transiently failed devices on resume Jonathan Brassow
@ 2013-04-22  0:43 ` NeilBrown
  2013-04-22 18:57   ` Brassow Jonathan
  2013-05-02 19:19 ` [PATCH - v2] " Jonathan Brassow
  1 sibling, 1 reply; 7+ messages in thread
From: NeilBrown @ 2013-04-22  0:43 UTC (permalink / raw)
  To: Jonathan Brassow; +Cc: linux-raid


On Thu, 11 Apr 2013 15:27:03 -0500 Jonathan Brassow <jbrassow@redhat.com>
wrote:

> DM RAID: Add ability to restore transiently failed devices on resume
> 
> This patch adds code to the resume function to check over the devices
> in the RAID array.  If any are found to be marked as failed and their
> superblocks can be read, an attempt is made to reintegrate them into
> the array.  This allows the user to refresh the array with a simple
> suspend and resume of the array - rather than having to load a
> completely new table, allocate and initialize all the structures and
> throw away the old instantiation.
> 
> Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>

Is this really a good idea?  
Just because you can read the superblock, that doesn't mean you can read any
other block in the array.

If the auto-recovery were optional I might consider it, but having it be
enforced on every suspend/resume just doesn't seem safe to me.

Have I misunderstood?

Is this that hard to achieve in user-space?


> +			if (test_bit(Faulty, &r->flags) && r->sb_page &&
> +			    !read_disk_sb(r, r->sb_size)) {

I know it is fairly widely used, especially for 'strcmp', but I absolutely
despise this construct.
  !read_disk_sb()
sounds like
  "not read_disk_sb" or "could not read the disk's superblock", but it
actually means "did not fail to read the disk's superblock" - the exact
reverse.
If 0 means success, I like to see an explicit test for 0:
   read_disk_sb() == 0
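
(Purely to illustrate the point - a minimal sketch assuming read_disk_sb()
returns 0 on success, with revive_disk() as a made-up stand-in:)

   if (!read_disk_sb(r, r->sb_size))        /* reads as "could not read it" */
           revive_disk(r);                  /* ...yet runs when the read worked */

   if (read_disk_sb(r, r->sb_size) == 0)    /* explicit: the read succeeded */
           revive_disk(r);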

NeilBrown




* Re: [PATCH] DM RAID: Add ability to restore transiently failed devices on resume
  2013-04-22  0:43 ` NeilBrown
@ 2013-04-22 18:57   ` Brassow Jonathan
  2013-04-24  6:39     ` NeilBrown
  0 siblings, 1 reply; 7+ messages in thread
From: Brassow Jonathan @ 2013-04-22 18:57 UTC (permalink / raw)
  To: NeilBrown
  Cc: linux-raid@vger.kernel.org Raid, Alasdair Kergon,
	Jonathan Brassow


On Apr 21, 2013, at 7:43 PM, NeilBrown wrote:

> On Thu, 11 Apr 2013 15:27:03 -0500 Jonathan Brassow <jbrassow@redhat.com>
> wrote:
> 
>> DM RAID: Add ability to restore transiently failed devices on resume
>> 
>> This patch adds code to the resume function to check over the devices
>> in the RAID array.  If any are found to be marked as failed and their
>> superblocks can be read, an attempt is made to reintegrate them into
>> the array.  This allows the user to refresh the array with a simple
>> suspend and resume of the array - rather than having to load a
>> completely new table, allocate and initialize all the structures and
>> throw away the old instantiation.
>> 
>> Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
> 
> Is this really a good idea?  
> Just because you can read the superblock, that doesn't mean you can read any
> other block in the array.

True.  To some extent, we are relying on user-space here.  The normal life of a DM device would consist of:
1) constructor (CTR)
2) resume
3) suspend (sometimes skipped)
4) destructor (DTR)

It is pretty rare to get a secondary 'resume'.  A DM device is normally suspended so that a different table can be loaded - in which case, the first instance is destroyed (DTR) and the second instance undergoes its first resume.  Another case that does happen, albeit rarely, is a "refresh" (e.g. 'lvchange --refresh vg/lv').  This operation performs a suspend/resume for the purpose of refreshing the in-kernel target.  This is the situation that I am trying to instrument with this patch.  You can see from the patch that it is only secondary 'resumes' where the check is made.

To say that there is no other case where a secondary resume would be done would be silly.  I can't account for every case.  However, the user must be resuming an existing target after a suspend (not a CTR) which is somewhat rare.  There must be a failed device in the array for the subsequent superblock read to be performed and that must succeed for the device to attempt revival.  If the device is really still in bad shape, it will fall back to 'Faulty'.

The approach I had been using in user-space was to load a completely new (and identical) mapping table.  The old table and structures were discarded and all the state of the devices was lost.  The new table was instantiated with new structures and all devices assumed to be fine.  (It would, of course, be noticed that a device had failed and must be recovered, but it would be assumed that it was alive and able to do so.)  User-space would do this partially based on the fact that the LVM label could be read from the disk and partially from the fact that the user had made the decision to run the 'lvchange --refresh' command in the first place.  This seems to be far more messy than possibly reviving the device in the 'resume'.

Another possibility is to put this functionality into the 'message' interface.  Any 'refresh' command could use that.  Then, the secondary resumes I am unable to account for would not also have the side effect of trying to revive any failed devices in the array that just happen to have readable superblocks.  However, it is nice to have the device suspended when the device revival is attempted.  If we put the revival code into the 'message' function, I may have to wrap the revival code with a suspend and resume anyway - not sure yet.
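
Roughly sketched (not part of the patch, and attempt_restore_of_faulty_devices()
is just a made-up name for the factored-out revival loop), a 'refresh' message
might look something like this:

	static int raid_message(struct dm_target *ti, unsigned argc, char **argv)
	{
		struct raid_set *rs = ti->private;

		if (argc == 1 && !strcasecmp(argv[0], "refresh")) {
			/*
			 * The revival loop from raid_resume() would live in a
			 * helper; it would probably still want the device
			 * quiesced, hence the suspend/resume question above.
			 */
			attempt_restore_of_faulty_devices(rs);
			return 0;
		}

		/* ... existing sync_action message handling ... */
		return -EINVAL;
	}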

Personally, I like having the revival code in the 'resume' over the 'message' function.  I'll make sure Alasdair gets a copy of this message to see if he has an opinion.

> 
> If the auto-recovery were optional I might consider it, but having it be
> enforced on every suspend/resume just doesn't seem safe to me.
> 
> Have I misunderstood?
> 
> Is this that hard to achieve in user-space?
> 
> 
>> +			if (test_bit(Faulty, &r->flags) && r->sb_page &&
>> +			    !read_disk_sb(r, r->sb_size)) {
> 
> I know it is fairly widely used, especially for 'strcmp', but I absolutely
> despise this construct.
>  !read_disk_sb()
> sounds like
>  "not read_disk_sb" or "could not read the disk's superblock", but it
> actually means "did not fail to read the disk's superblock" - the exact
> reverse.
> If 0 means success, I like to see an explicit test for 0:
>   read_disk_sb() == 0

Ok, I'll try to remember that.  This issue has solved itself though, because I must switch to using sync_page_io directly anyway.  I'll have to post another patch whether we proceed as-is or decide to change where we put this code.

 brassow



* Re: [PATCH] DM RAID: Add ability to restore transiently failed devices on resume
  2013-04-22 18:57   ` Brassow Jonathan
@ 2013-04-24  6:39     ` NeilBrown
  0 siblings, 0 replies; 7+ messages in thread
From: NeilBrown @ 2013-04-24  6:39 UTC (permalink / raw)
  To: Brassow Jonathan; +Cc: linux-raid@vger.kernel.org Raid, Alasdair Kergon


On Mon, 22 Apr 2013 13:57:03 -0500 Brassow Jonathan <jbrassow@redhat.com>
wrote:

> 
> On Apr 21, 2013, at 7:43 PM, NeilBrown wrote:
> 
> > On Thu, 11 Apr 2013 15:27:03 -0500 Jonathan Brassow <jbrassow@redhat.com>
> > wrote:
> > 
> >> DM RAID: Add ability to restore transiently failed devices on resume
> >> 
> >> This patch adds code to the resume function to check over the devices
> >> in the RAID array.  If any are found to be marked as failed and their
> >> superblocks can be read, an attempt is made to reintegrate them into
> >> the array.  This allows the user to refresh the array with a simple
> >> suspend and resume of the array - rather than having to load a
> >> completely new table, allocate and initialize all the structures and
> >> throw away the old instantiation.
> >> 
> >> Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
> > 
> > Is this really a good idea?  
> > Just because you can read the superblock, that doesn't mean you can read any
> > other block in the array.
> 
> True.  To some extent, we are relying on user-space here.  The normal life of a DM device would consist of:
> 1) constructor (CTR)
> 2) resume
> 3) suspend (sometimes skipped)
> 4) destructor (DTR)
> 
> It is pretty rare to get a secondary 'resume'.  A DM device is normally suspended so that a different table can be loaded - in which case, the first instance is destroyed (DTR) and the second instance undergoes its first resume.  Another case that does happen, albeit rarely, is a "refresh" (e.g. 'lvchange --refresh vg/lv').  This operation performs a suspend/resume for the purpose of refreshing the in-kernel target.  This is the situation that I am trying to instrument with this patch.  You can see from the patch that it is only secondary 'resumes' where the check is made.
> 
> To say that there is no other case where a secondary resume would be done would be silly.  I can't account for every case.  However, the user must be resuming an existing target after a suspend (not a CTR) which is somewhat rare.  There must be a failed device in the array for the subsequent superblock read to be performed and that must succeed for the device to attempt revival.  If the device is really still in bad shape, it will fall back to 'Faulty'.
> 
> The approach I had been using in user-space was to load a completely new (and identical) mapping table.  The old table and structures were discarded and all the state of the devices was lost.  The new table was instantiated with new structures and all devices assumed to be fine.  (It would, of course, be noticed that a device had failed and must be recovered, but it would be assumed that it was alive and able to do so.)  User-space would do this partially based on the fact that the LVM label could be read from the disk and partially from the fact that the user had made the decision to run the 'lvchange --refresh' command in the first place.  This seems to be far more messy than possibly reviving the device in the 'resume'.
> 
> Another possibility is to put this functionality into the 'message' interface.  Any 'refresh' command could use that.  Then, the secondary resumes I am unable to account for would not also have the side effect of trying to revive any failed devices in the array that just happen to have readable superblocks.  However, it is nice to have the device suspended when the device revival is attempted.  If we put the revival code into the 'message' function, I may have to wrap the revival code with a suspend and resume anyway - not sure yet.
> 
> Personally, I like having the revival code in the 'resume' over the 'message' function.  I'll make sure Alasdair gets a copy of this message to see if he has an opinion.

Thanks.  Your explanation does much to allay my concerns.
If Alasdair doesn't object to a secondary resume always attempting to
reactivate any devices that have been marked as faulty, then I won't object
either.

Thanks,
NeilBrown


> 
> > 
> > If the auto-recovery were optional I might consider it, but having it be
> > enforced on every suspend/resume just doesn't seem safe to me.
> > 
> > Have I misunderstood?
> > 
> > Is this that hard to achieve in user-space?
> > 
> > 
> >> +			if (test_bit(Faulty, &r->flags) && r->sb_page &&
> >> +			    !read_disk_sb(r, r->sb_size)) {
> > 
> > I know it is fairly widely used, especially for 'strcmp', but I absolutely
> > despise this construct.
> >  !read_disk_sb()
> > sounds like
> >  "not read_disk_sb" or "could not read the disk's superblock", but it
> > actually means "did not fail to read the disk's superblock" - the exact
> > reverse.
> > If 0 means success, I like to see an explicit test for 0:
> >   read_disk_sb() == 0
> 
> Ok, I'll try to remember that.  This issue has solved itself though, because I must switch to using sync_page_io directly anyway.  I'll have to post another patch whether we proceed as-is or decide to change where we put this code.
> 
>  brassow




* [PATCH - v2] DM RAID: Add ability to restore transiently failed devices on resume
  2013-04-11 20:27 [PATCH] DM RAID: Add ability to restore transiently failed devices on resume Jonathan Brassow
  2013-04-22  0:43 ` NeilBrown
@ 2013-05-02 19:19 ` Jonathan Brassow
  2013-05-06  6:00   ` NeilBrown
  1 sibling, 1 reply; 7+ messages in thread
From: Jonathan Brassow @ 2013-05-02 19:19 UTC (permalink / raw)
  To: linux-raid; +Cc: neilb, agk, jbrassow

DM RAID: Add ability to restore transiently failed devices on resume

This patch adds code to the resume function to check over the devices
in the RAID array.  If any are found to be marked as failed and their
superblocks can be read, an attempt is made to reintegrate them into
the array.  This allows the user to refresh the array with a simple
suspend and resume of the array - rather than having to load a
completely new table, allocate and initialize all the structures and
throw away the old instantiation.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>

Index: linux-upstream/drivers/md/dm-raid.c
===================================================================
--- linux-upstream.orig/drivers/md/dm-raid.c
+++ linux-upstream/drivers/md/dm-raid.c
@@ -1574,12 +1574,54 @@ static void raid_postsuspend(struct dm_t
 
 static void raid_resume(struct dm_target *ti)
 {
+	int i;
+	uint64_t failed_devices, cleared_failed_devices = 0;
+	unsigned long flags;
+	struct dm_raid_superblock *sb;
 	struct raid_set *rs = ti->private;
+	struct md_rdev *r;
 
 	set_bit(MD_CHANGE_DEVS, &rs->md.flags);
 	if (!rs->bitmap_loaded) {
 		bitmap_load(&rs->md);
 		rs->bitmap_loaded = 1;
+	} else {
+		/*
+		 * A secondary resume while the device is active.
+		 * Take this opportunity to check whether any failed
+		 * devices are reachable again.
+		 */
+		for (i = 0; i < rs->md.raid_disks; i++) {
+			r = &rs->dev[i].rdev;
+			if (test_bit(Faulty, &r->flags) && r->sb_page &&
+			    sync_page_io(r, 0, r->sb_size,
+					 r->sb_page, READ, 1)) {
+				DMINFO("Faulty device #%d has readable super"
+				       "block.  Attempting to revive it.", i);
+				r->raid_disk = i;
+				r->saved_raid_disk = i;
+				flags = r->flags;
+				clear_bit(Faulty, &r->flags);
+				clear_bit(WriteErrorSeen, &r->flags);
+				clear_bit(In_sync, &r->flags);
+				if (r->mddev->pers->hot_add_disk(r->mddev, r)) {
+					r->raid_disk = -1;
+					r->saved_raid_disk = -1;
+					r->flags = flags;
+				} else {
+					r->recovery_offset = 0;
+					cleared_failed_devices |= 1 << i;
+				}
+			}
+		}
+		if (cleared_failed_devices) {
+			rdev_for_each(r, &rs->md) {
+				sb = page_address(r->sb_page);
+				failed_devices = le64_to_cpu(sb->failed_devices);
+				failed_devices &= ~cleared_failed_devices;
+				sb->failed_devices = cpu_to_le64(failed_devices);
+			}
+		}
 	}
 
 	clear_bit(MD_RECOVERY_FROZEN, &rs->md.recovery);
@@ -1588,7 +1630,7 @@ static void raid_resume(struct dm_target
 
 static struct target_type raid_target = {
 	.name = "raid",
-	.version = {1, 5, 0},
+	.version = {1, 5, 1},
 	.module = THIS_MODULE,
 	.ctr = raid_ctr,
 	.dtr = raid_dtr,
Index: linux-upstream/drivers/md/raid1.c
===================================================================
--- linux-upstream.orig/drivers/md/raid1.c
+++ linux-upstream/drivers/md/raid1.c
@@ -1518,8 +1518,9 @@ static int raid1_add_disk(struct mddev *
 		p = conf->mirrors+mirror;
 		if (!p->rdev) {
 
-			disk_stack_limits(mddev->gendisk, rdev->bdev,
-					  rdev->data_offset << 9);
+			if (mddev->gendisk)
+				disk_stack_limits(mddev->gendisk, rdev->bdev,
+						  rdev->data_offset << 9);
 
 			p->head_position = 0;
 			rdev->raid_disk = mirror;
@@ -1558,7 +1559,7 @@ static int raid1_add_disk(struct mddev *
 		clear_bit(Unmerged, &rdev->flags);
 	}
 	md_integrity_add_rdev(rdev, mddev);
-	if (blk_queue_discard(bdev_get_queue(rdev->bdev)))
+	if (mddev->queue && blk_queue_discard(bdev_get_queue(rdev->bdev)))
 		queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, mddev->queue);
 	print_conf(conf);
 	return err;
Index: linux-upstream/drivers/md/raid10.c
===================================================================
--- linux-upstream.orig/drivers/md/raid10.c
+++ linux-upstream/drivers/md/raid10.c
@@ -1806,15 +1806,17 @@ static int raid10_add_disk(struct mddev
 			set_bit(Replacement, &rdev->flags);
 			rdev->raid_disk = mirror;
 			err = 0;
-			disk_stack_limits(mddev->gendisk, rdev->bdev,
-					  rdev->data_offset << 9);
+			if (mddev->gendisk)
+				disk_stack_limits(mddev->gendisk, rdev->bdev,
+						  rdev->data_offset << 9);
 			conf->fullsync = 1;
 			rcu_assign_pointer(p->replacement, rdev);
 			break;
 		}
 
-		disk_stack_limits(mddev->gendisk, rdev->bdev,
-				  rdev->data_offset << 9);
+		if (mddev->gendisk)
+			disk_stack_limits(mddev->gendisk, rdev->bdev,
+					  rdev->data_offset << 9);
 
 		p->head_position = 0;
 		p->recovery_disabled = mddev->recovery_disabled - 1;
Index: linux-upstream/Documentation/device-mapper/dm-raid.txt
===================================================================
--- linux-upstream.orig/Documentation/device-mapper/dm-raid.txt
+++ linux-upstream/Documentation/device-mapper/dm-raid.txt
@@ -222,3 +222,4 @@ Version History
 1.4.2   Add RAID10 "far" and "offset" algorithm support.
 1.5.0   Add message interface to allow manipulation of the sync_action.
 	New status (STATUSTYPE_INFO) fields: sync_action and mismatch_cnt.
+1.5.1   Add ability to restore transiently failed devices on resume.




* Re: [PATCH - v2] DM RAID: Add ability to restore transiently failed devices on resume
  2013-05-02 19:19 ` [PATCH - v2] " Jonathan Brassow
@ 2013-05-06  6:00   ` NeilBrown
  2013-05-06 14:55     ` Brassow Jonathan
  0 siblings, 1 reply; 7+ messages in thread
From: NeilBrown @ 2013-05-06  6:00 UTC (permalink / raw)
  To: Jonathan Brassow; +Cc: linux-raid, agk


On Thu, 02 May 2013 14:19:24 -0500 Jonathan Brassow <jbrassow@redhat.com>
wrote:

> DM RAID: Add ability to restore transiently failed devices on resume
> 
> This patch adds code to the resume function to check over the devices
> in the RAID array.  If any are found to be marked as failed and their
> superblocks can be read, an attempt is made to reintegrate them into
> the array.  This allows the user to refresh the array with a simple
> suspend and resume of the array - rather than having to load a
> completely new table, allocate and initialize all the structures and
> throw away the old instantiation.
> 
> Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
> 
> Index: linux-upstream/drivers/md/dm-raid.c
> ===================================================================
> --- linux-upstream.orig/drivers/md/dm-raid.c
> +++ linux-upstream/drivers/md/dm-raid.c
> @@ -1574,12 +1574,54 @@ static void raid_postsuspend(struct dm_t
>  
>  static void raid_resume(struct dm_target *ti)
>  {
> +	int i;
> +	uint64_t failed_devices, cleared_failed_devices = 0;
> +	unsigned long flags;
> +	struct dm_raid_superblock *sb;
>  	struct raid_set *rs = ti->private;
> +	struct md_rdev *r;
>  
>  	set_bit(MD_CHANGE_DEVS, &rs->md.flags);
>  	if (!rs->bitmap_loaded) {
>  		bitmap_load(&rs->md);
>  		rs->bitmap_loaded = 1;
> +	} else {
> +		/*
> +		 * A secondary resume while the device is active.
> +		 * Take this opportunity to check whether any failed
> +		 * devices are reachable again.
> +		 */
> +		for (i = 0; i < rs->md.raid_disks; i++) {
> +			r = &rs->dev[i].rdev;
> +			if (test_bit(Faulty, &r->flags) && r->sb_page &&
> +			    sync_page_io(r, 0, r->sb_size,
> +					 r->sb_page, READ, 1)) {
> +				DMINFO("Faulty device #%d has readable super"
> +				       "block.  Attempting to revive it.", i);
> +				r->raid_disk = i;
> +				r->saved_raid_disk = i;
> +				flags = r->flags;
> +				clear_bit(Faulty, &r->flags);
> +				clear_bit(WriteErrorSeen, &r->flags);
> +				clear_bit(In_sync, &r->flags);
> +				if (r->mddev->pers->hot_add_disk(r->mddev, r)) {
> +					r->raid_disk = -1;
> +					r->saved_raid_disk = -1;
> +					r->flags = flags;
> +				} else {
> +					r->recovery_offset = 0;
> +					cleared_failed_devices |= 1 << i;
> +				}
> +			}
> +		}
> +		if (cleared_failed_devices) {
> +			rdev_for_each(r, &rs->md) {
> +				sb = page_address(r->sb_page);
> +				failed_devices = le64_to_cpu(sb->failed_devices);
> +				failed_devices &= ~cleared_failed_devices;
> +				sb->failed_devices = cpu_to_le64(failed_devices);
> +			}
> +		}
>  	}
>  
>  	clear_bit(MD_RECOVERY_FROZEN, &rs->md.recovery);
> @@ -1588,7 +1630,7 @@ static void raid_resume(struct dm_target
>  
>  static struct target_type raid_target = {
>  	.name = "raid",
> -	.version = {1, 5, 0},
> +	.version = {1, 5, 1},
>  	.module = THIS_MODULE,
>  	.ctr = raid_ctr,
>  	.dtr = raid_dtr,
> Index: linux-upstream/drivers/md/raid1.c
> ===================================================================
> --- linux-upstream.orig/drivers/md/raid1.c
> +++ linux-upstream/drivers/md/raid1.c
> @@ -1518,8 +1518,9 @@ static int raid1_add_disk(struct mddev *
>  		p = conf->mirrors+mirror;
>  		if (!p->rdev) {
>  
> -			disk_stack_limits(mddev->gendisk, rdev->bdev,
> -					  rdev->data_offset << 9);
> +			if (mddev->gendisk)
> +				disk_stack_limits(mddev->gendisk, rdev->bdev,
> +						  rdev->data_offset << 9);
>  
>  			p->head_position = 0;
>  			rdev->raid_disk = mirror;
> @@ -1558,7 +1559,7 @@ static int raid1_add_disk(struct mddev *
>  		clear_bit(Unmerged, &rdev->flags);
>  	}
>  	md_integrity_add_rdev(rdev, mddev);
> -	if (blk_queue_discard(bdev_get_queue(rdev->bdev)))
> +	if (mddev->queue && blk_queue_discard(bdev_get_queue(rdev->bdev)))
>  		queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, mddev->queue);
>  	print_conf(conf);
>  	return err;
> Index: linux-upstream/drivers/md/raid10.c
> ===================================================================
> --- linux-upstream.orig/drivers/md/raid10.c
> +++ linux-upstream/drivers/md/raid10.c
> @@ -1806,15 +1806,17 @@ static int raid10_add_disk(struct mddev
>  			set_bit(Replacement, &rdev->flags);
>  			rdev->raid_disk = mirror;
>  			err = 0;
> -			disk_stack_limits(mddev->gendisk, rdev->bdev,
> -					  rdev->data_offset << 9);
> +			if (mddev->gendisk)
> +				disk_stack_limits(mddev->gendisk, rdev->bdev,
> +						  rdev->data_offset << 9);
>  			conf->fullsync = 1;
>  			rcu_assign_pointer(p->replacement, rdev);
>  			break;
>  		}
>  
> -		disk_stack_limits(mddev->gendisk, rdev->bdev,
> -				  rdev->data_offset << 9);
> +		if (mddev->gendisk)
> +			disk_stack_limits(mddev->gendisk, rdev->bdev,
> +					  rdev->data_offset << 9);
>  
>  		p->head_position = 0;
>  		p->recovery_disabled = mddev->recovery_disabled - 1;
> Index: linux-upstream/Documentation/device-mapper/dm-raid.txt
> ===================================================================
> --- linux-upstream.orig/Documentation/device-mapper/dm-raid.txt
> +++ linux-upstream/Documentation/device-mapper/dm-raid.txt
> @@ -222,3 +222,4 @@ Version History
>  1.4.2   Add RAID10 "far" and "offset" algorithm support.
>  1.5.0   Add message interface to allow manipulation of the sync_action.
>  	New status (STATUSTYPE_INFO) fields: sync_action and mismatch_cnt.
> +1.5.1   Add ability to restore transiently failed devices on resume.
> 


Applied, thanks.  I assume this is heading for 3.11?

NeilBrown



* Re: [PATCH - v2] DM RAID: Add ability to restore transiently failed devices on resume
  2013-05-06  6:00   ` NeilBrown
@ 2013-05-06 14:55     ` Brassow Jonathan
  0 siblings, 0 replies; 7+ messages in thread
From: Brassow Jonathan @ 2013-05-06 14:55 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid, agk


On May 6, 2013, at 1:00 AM, NeilBrown wrote:

> On Thu, 02 May 2013 14:19:24 -0500 Jonathan Brassow <jbrassow@redhat.com>
> wrote:
> 
>> DM RAID: Add ability to restore transiently failed devices on resume
>> 
>> This patch adds code to the resume function to check over the devices
>> in the RAID array.  If any are found to be marked as failed and their
>> superblocks can be read, an attempt is made to reintegrate them into
>> the array.  This allows the user to refresh the array with a simple
>> suspend and resume of the array - rather than having to load a
>> completely new table, allocate and initialize all the structures and
>> throw away the old instantiation.
>> 
>> Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
>> 

<snip/>

>> +1.5.1   Add ability to restore transiently failed devices on resume.
>> 
> 
> 
> Applied thanks.  I assume this is heading for 3.11 ?

3.11 is fine.  thanks,
 brassow


