linux-raid.vger.kernel.org archive mirror
* [PATCH] md: Fix bug where new drives added to an md array sometimes don't sync properly.
       [not found] <20061005171233.6542.patches@notabene>
@ 2006-10-05  7:13 ` NeilBrown
  2006-10-05 19:26   ` Eli Stair
  0 siblings, 1 reply; 11+ messages in thread
From: NeilBrown @ 2006-10-05  7:13 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid, linux-kernel

There is a nasty bug in md in 2.6.18 affecting at least raid1.
This fixes it (and has already been sent to stable@kernel.org).

### Comments for Changeset

This fixes a bug introduced in 2.6.18. 

If a drive is added to a raid1 using older tools (mdadm-1.x or
raidtools) then it will be included in the array without any resync
happening.
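
(As a hedged illustration only, with made-up device names: after this
fix, hot-adding a disk to a degraded raid1 with one of the affected
older tools, e.g.

   mdadm /dev/md0 --add /dev/sdc1     # mdadm-1.x style hot-add
   cat /proc/mdstat                   # expect a "recovery" progress line

should once again show a resync in progress rather than the new disk
jumping straight to "active sync".)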

It has been submitted for 2.6.18.1.


Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/md.c |    1 +
 1 file changed, 1 insertion(+)

diff .prev/drivers/md/md.c ./drivers/md/md.c
--- .prev/drivers/md/md.c	2006-09-29 11:51:39.000000000 +1000
+++ ./drivers/md/md.c	2006-10-05 16:40:51.000000000 +1000
@@ -3849,6 +3849,7 @@ static int hot_add_disk(mddev_t * mddev,
 	}
 	clear_bit(In_sync, &rdev->flags);
 	rdev->desc_nr = -1;
+	rdev->saved_raid_disk = -1;
 	err = bind_rdev_to_array(rdev, mddev);
 	if (err)
 		goto abort_export;


* Re: [PATCH] md: Fix bug where new drives added to an md array sometimes don't sync properly.
  2006-10-05  7:13 ` [PATCH] md: Fix bug where new drives added to an md array sometimes don't sync properly NeilBrown
@ 2006-10-05 19:26   ` Eli Stair
  2006-10-06 22:42     ` Eli Stair
  0 siblings, 1 reply; 11+ messages in thread
From: Eli Stair @ 2006-10-05 19:26 UTC (permalink / raw)
  To: linux-raid



I'm actually seeing similar behaviour on RAID10 (2.6.18): after 
removing a drive from an array, re-adding it sometimes results in it 
still being listed as a faulty spare and not being "taken" for resync. 
In the same scenario, after swapping drives, doing a fail, then a 
remove, then an 'add' doesn't work; only a re-add will even get the 
drive listed by mdadm.


What are the failure modes/symptoms that this patch is resolving?

Is it possible this affects the RAID10 module/mode as well?  If not, 
I'll start a new thread for that.  I'm testing this patch to see if it 
does remedy the situation on RAID10, and will update after some 
significant testing.


/eli








NeilBrown wrote:
> There is a nasty bug in md in 2.6.18 affecting at least raid1.
> This fixes it (and has already been sent to stable@kernel.org).
> 
> ### Comments for Changeset
> 
> This fixes a bug introduced in 2.6.18.
> 
> If a drive is added to a raid1 using older tools (mdadm-1.x or
> raidtools) then it will be included in the array without any resync
> happening.
> 
> It has been submitted for 2.6.18.1.
> 
> 
> Signed-off-by: Neil Brown <neilb@suse.de>
> 
> ### Diffstat output
>  ./drivers/md/md.c |    1 +
>  1 file changed, 1 insertion(+)
> 
> diff .prev/drivers/md/md.c ./drivers/md/md.c
> --- .prev/drivers/md/md.c       2006-09-29 11:51:39.000000000 +1000
> +++ ./drivers/md/md.c   2006-10-05 16:40:51.000000000 +1000
> @@ -3849,6 +3849,7 @@ static int hot_add_disk(mddev_t * mddev,
>         }
>         clear_bit(In_sync, &rdev->flags);
>         rdev->desc_nr = -1;
> +       rdev->saved_raid_disk = -1;
>         err = bind_rdev_to_array(rdev, mddev);
>         if (err)
>                 goto abort_export;



* Re: [PATCH] md: Fix bug where new drives added to an md array sometimes don't sync properly.
  2006-10-05 19:26   ` Eli Stair
@ 2006-10-06 22:42     ` Eli Stair
  2006-10-10  2:00       ` Neil Brown
  2006-10-10 20:20       ` Eli Stair
  0 siblings, 2 replies; 11+ messages in thread
From: Eli Stair @ 2006-10-06 22:42 UTC (permalink / raw)
  To: linux-raid


This patch has resolved the immediate issue I was having on 2.6.18 with 
RAID10.  Prior to this change, after removing a device from the array 
(with mdadm --remove), physically pulling the device and 
changing/re-inserting it, the "Number" of the new device would be 
incremented above the highest one present in the array.  Now it 
resumes its previous place.

Does this look like 'correct' output for a 14-drive array from which 
dev 8 was failed/removed and then added back?  I'm trying to determine 
why the device doesn't get pulled back into the active configuration 
and re-synced.  Any comments?

Thanks!

/eli

For example, currently when device dm-8 is removed it shows up like this:



     Number   Major   Minor   RaidDevice State
        0     253        0        0      active sync   /dev/dm-0
        1     253        1        1      active sync   /dev/dm-1
        2     253        2        2      active sync   /dev/dm-2
        3     253        3        3      active sync   /dev/dm-3
        4     253        4        4      active sync   /dev/dm-4
        5     253        5        5      active sync   /dev/dm-5
        6     253        6        6      active sync   /dev/dm-6
        7     253        7        7      active sync   /dev/dm-7
        8       0        0        8      removed
        9     253        9        9      active sync   /dev/dm-9
       10     253       10       10      active sync   /dev/dm-10
       11     253       11       11      active sync   /dev/dm-11
       12     253       12       12      active sync   /dev/dm-12
       13     253       13       13      active sync   /dev/dm-13

        8     253        8        -      spare   /dev/dm-8


Previously, however, it would come back with the "Number" as 14, not 8 
as it should.  Shortly thereafter things got all out of whack, in 
addition to just not working properly :)  Now I just have to figure out 
how to get the re-introduced drive to participate in the array again 
like it should.
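
(For what it's worth, a hedged sketch of the cycle that would normally 
get a pulled drive back into service; the device and array names are 
just the ones from the listing above, and --zero-superblock simply 
discards the stale md metadata so the device comes back as a plain 
spare:)

   mdadm /dev/md0 --fail /dev/dm-8
   mdadm /dev/md0 --remove /dev/dm-8
   mdadm --zero-superblock /dev/dm-8    # drop the old superblock
   mdadm /dev/md0 --add /dev/dm-8       # should kick off a recovery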

Eli Stair wrote:
> 
> 
> I'm actually seeing similar behaviour on RAID10 (2.6.18), where after
> removing a drive from an array re-adding it sometimes results in it
> still being listed as a faulty-spare and not being "taken" for resync.
> In the same scenario, after swapping drives, doing a fail,remove, then
> an 'add' doesn't work, only a re-add will even get the drive listed by
> MDADM.
> 
> 
> What's the failure mode/symptoms that this patch is resolving?
> 
> Is it possible this affects the RAID10 module/mode as well?  If not,
> I'll start a new thread for that.  I'm testing this patch to see if it
> does remedy the situation on RAID10, and will update after some
> significant testing.
> 
> 
> /eli
> 
> 
> 
> 
> 
> 
> 
> 
> NeilBrown wrote:
>  > There is a nasty bug in md in 2.6.18 affecting at least raid1.
>  > This fixes it (and has already been sent to stable@kernel.org).
>  >
>  > ### Comments for Changeset
>  >
>  > This fixes a bug introduced in 2.6.18.
>  >
>  > If a drive is added to a raid1 using older tools (mdadm-1.x or
>  > raidtools) then it will be included in the array without any resync
>  > happening.
>  >
>  > It has been submitted for 2.6.18.1.
>  >
>  >
>  > Signed-off-by: Neil Brown <neilb@suse.de>
>  >
>  > ### Diffstat output
>  >  ./drivers/md/md.c |    1 +
>  >  1 file changed, 1 insertion(+)
>  >
>  > diff .prev/drivers/md/md.c ./drivers/md/md.c
>  > --- .prev/drivers/md/md.c       2006-09-29 11:51:39.000000000 +1000
>  > +++ ./drivers/md/md.c   2006-10-05 16:40:51.000000000 +1000
>  > @@ -3849,6 +3849,7 @@ static int hot_add_disk(mddev_t * mddev,
>  >         }
>  >         clear_bit(In_sync, &rdev->flags);
>  >         rdev->desc_nr = -1;
>  > +       rdev->saved_raid_disk = -1;
>  >         err = bind_rdev_to_array(rdev, mddev);
>  >         if (err)
>  >                 goto abort_export;



* Re: [PATCH] md: Fix bug where new drives added to an md array sometimes don't sync properly.
  2006-10-06 22:42     ` Eli Stair
@ 2006-10-10  2:00       ` Neil Brown
  2006-10-10 20:42         ` Eli Stair
                           ` (2 more replies)
  2006-10-10 20:20       ` Eli Stair
  1 sibling, 3 replies; 11+ messages in thread
From: Neil Brown @ 2006-10-10  2:00 UTC (permalink / raw)
  To: Eli Stair; +Cc: linux-raid

On Friday October 6, estair@ilm.com wrote:
> 
> This patch has resolved the immediate issue I was having on 2.6.18 with 
> RAID10.  Previous to this change, after removing a device from the array 
> (with mdadm --remove), physically pulling the device and 
> changing/re-inserting, the "Number" of the new device would be 
> incremented on top of the highest-present device in the array.  Now, it 
> resumes its previous place.
> 
> Does this look to be 'correct' output for a 14-drive array, which dev 8 
> was failed/removed from then "add"'ed?  I'm trying to determine why the 
> device doesn't get pulled back into the active configuration and 
> re-synced.  Any comments?

Does this patch help?



Fix count of degraded drives in raid10.


Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/raid10.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff .prev/drivers/md/raid10.c ./drivers/md/raid10.c
--- .prev/drivers/md/raid10.c	2006-10-09 14:18:00.000000000 +1000
+++ ./drivers/md/raid10.c	2006-10-05 20:10:07.000000000 +1000
@@ -2079,7 +2079,7 @@ static int run(mddev_t *mddev)
 		disk = conf->mirrors + i;
 
 		if (!disk->rdev ||
-		    !test_bit(In_sync, &rdev->flags)) {
+		    !test_bit(In_sync, &disk->rdev->flags)) {
 			disk->head_position = 0;
 			mddev->degraded++;
 		}


NeilBrown


* Re: [PATCH] md: Fix bug where new drives added to an md array sometimes don't sync properly.
  2006-10-06 22:42     ` Eli Stair
  2006-10-10  2:00       ` Neil Brown
@ 2006-10-10 20:20       ` Eli Stair
  1 sibling, 0 replies; 11+ messages in thread
From: Eli Stair @ 2006-10-10 20:20 UTC (permalink / raw)
  To: linux-raid


Looks like this issue isn't fully resolved after all.  After spending 
some time trying to get the re-added drive to sync, I've removed and 
added it again.  This resulted in the previous behaviour I saw: the 
drive lost its original numeric position and became "14".

This now looks 100% repeatable and appears to be a race condition.  One 
item of note is that if I build the array with a version 1.2 
superblock, this mis-numbering behaviour seems to disappear (I've run 
through it five times since without recurrence).

Doing a single-command fail/remove fails the device but errors on removal:

[root@gtmp03 ~]# mdadm /dev/md0 --fail /dev/dm-13 --remove /dev/dm-13
mdadm: set /dev/dm-13 faulty in /dev/md0
mdadm: hot remove failed for /dev/dm-13: Device or resource busy





     Number   Major   Minor   RaidDevice State
        0     253        0        0      active sync   /dev/dm-0
        1     253        1        1      active sync   /dev/dm-1
        2     253        2        2      active sync   /dev/dm-2
        3     253        3        3      active sync   /dev/dm-3
        4     253        4        4      active sync   /dev/dm-4
        5     253        5        5      active sync   /dev/dm-5
        6     253        6        6      active sync   /dev/dm-6
        7     253        7        7      active sync   /dev/dm-7
        8       0        0        8      removed
        9     253        9        9      active sync   /dev/dm-9
       10     253       10       10      active sync   /dev/dm-10
       11     253       11       11      active sync   /dev/dm-11
       12     253       12       12      active sync   /dev/dm-12
       13     253       13       13      active sync   /dev/dm-13

       14     253        8        -      spare   /dev/dm-8



Eli Stair wrote:
> 
> This patch has resolved the immediate issue I was having on 2.6.18 with
> RAID10.  Previous to this change, after removing a device from the array
> (with mdadm --remove), physically pulling the device and
> changing/re-inserting, the "Number" of the new device would be
> incremented on top of the highest-present device in the array.  Now, it
> resumes its previous place.
> 
> Does this look to be 'correct' output for a 14-drive array, which dev 8
> was failed/removed from then "add"'ed?  I'm trying to determine why the
> device doesn't get pulled back into the active configuration and
> re-synced.  Any comments?
> 
> Thanks!
> 
> /eli
> 
> For example, currently when device dm-8 is removed it shows up like this:
> 
> 
> 
>      Number   Major   Minor   RaidDevice State
>         0     253        0        0      active sync   /dev/dm-0
>         1     253        1        1      active sync   /dev/dm-1
>         2     253        2        2      active sync   /dev/dm-2
>         3     253        3        3      active sync   /dev/dm-3
>         4     253        4        4      active sync   /dev/dm-4
>         5     253        5        5      active sync   /dev/dm-5
>         6     253        6        6      active sync   /dev/dm-6
>         7     253        7        7      active sync   /dev/dm-7
>         8       0        0        8      removed
>         9     253        9        9      active sync   /dev/dm-9
>        10     253       10       10      active sync   /dev/dm-10
>        11     253       11       11      active sync   /dev/dm-11
>        12     253       12       12      active sync   /dev/dm-12
>        13     253       13       13      active sync   /dev/dm-13
> 
>         8     253        8        -      spare   /dev/dm-8
> 
> 
> Previously however, it would come back with the "Number" as 14, not 8 as
> it should.  Shortly thereafter things got all out of whack, in addition
> to just not working properly :)  Now I've just got to figure out how to
> get the re-introduced drive to participate in the array again like it
> should.
> 
> Eli Stair wrote:
>  >
>  >
>  > I'm actually seeing similar behaviour on RAID10 (2.6.18), where after
>  > removing a drive from an array re-adding it sometimes results in it
>  > still being listed as a faulty-spare and not being "taken" for resync.
>  > In the same scenario, after swapping drives, doing a fail,remove, then
>  > an 'add' doesn't work, only a re-add will even get the drive listed by
>  > MDADM.
>  >
>  >
>  > What's the failure mode/symptoms that this patch is resolving?
>  >
>  > Is it possible this affects the RAID10 module/mode as well?  If not,
>  > I'll start a new thread for that.  I'm testing this patch to see if it
>  > does remedy the situation on RAID10, and will update after some
>  > significant testing.
>  >
>  >
>  > /eli
>  >
>  >
>  >
>  >
>  >
>  >
>  >
>  >
>  > NeilBrown wrote:
>  >  > There is a nasty bug in md in 2.6.18 affecting at least raid1.
>  >  > This fixes it (and has already been sent to stable@kernel.org).
>  >  >
>  >  > ### Comments for Changeset
>  >  >
>  >  > This fixes a bug introduced in 2.6.18.
>  >  >
>  >  > If a drive is added to a raid1 using older tools (mdadm-1.x or
>  >  > raidtools) then it will be included in the array without any resync
>  >  > happening.
>  >  >
>  >  > It has been submitted for 2.6.18.1.
>  >  >
>  >  >
>  >  > Signed-off-by: Neil Brown <neilb@suse.de>
>  >  >
>  >  > ### Diffstat output
>  >  >  ./drivers/md/md.c |    1 +
>  >  >  1 file changed, 1 insertion(+)
>  >  >
>  >  > diff .prev/drivers/md/md.c ./drivers/md/md.c
>  >  > --- .prev/drivers/md/md.c       2006-09-29 11:51:39.000000000 +1000
>  >  > +++ ./drivers/md/md.c   2006-10-05 16:40:51.000000000 +1000
>  >  > @@ -3849,6 +3849,7 @@ static int hot_add_disk(mddev_t * mddev,
>  >  >         }
>  >  >         clear_bit(In_sync, &rdev->flags);
>  >  >         rdev->desc_nr = -1;
>  >  > +       rdev->saved_raid_disk = -1;
>  >  >         err = bind_rdev_to_array(rdev, mddev);
>  >  >         if (err)
>  >  >                 goto abort_export;



* Re: [PATCH] md: Fix bug where new drives added to an md array sometimes don't sync properly.
  2006-10-10  2:00       ` Neil Brown
@ 2006-10-10 20:42         ` Eli Stair
  2006-10-11  0:00           ` Eli Stair
  2006-10-12 10:02         ` Michael Tokarev
  2006-10-18 22:18         ` Eli Stair
  2 siblings, 1 reply; 11+ messages in thread
From: Eli Stair @ 2006-10-10 20:42 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 4550 bytes --]


Thanks Neil,

I just gave this patched module a shot on four systems.  So far I 
haven't seen the device number inappropriately increment, though, as 
per a mail I sent a short while ago, that already seemed to be remedied 
by using the 1.2 superblock, for some reason.  However, the patch 
appears to have introduced a new issue, and another is unresolved by 
it:



// BUG 1
The single-command syntax to fail and remove a drive is still failing; 
I do not know whether this is somehow contributing to the further (new) 
issues below:

   [root@gtmp06 tmp]# mdadm /dev/md0 --fail /dev/dm-0 --remove /dev/dm-0
   mdadm: set /dev/dm-0 faulty in /dev/md0
   mdadm: hot remove failed for /dev/dm-0: Device or resource busy

   [root@gtmp06 tmp]# mdadm /dev/md0 --remove /dev/dm-0
   mdadm: hot removed /dev/dm-0
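
(A hedged workaround sketch, assuming the "busy" state is transient 
while md finishes handling the just-failed device; device and array 
names are the ones from the transcript above:)

   mdadm /dev/md0 --fail /dev/dm-0
   # retry the hot remove until md releases the device
   until mdadm /dev/md0 --remove /dev/dm-0 ; do sleep 1 ; done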


// BUG 2
Now, upon adding or re-adding a drive that was failed and removed, it 
is not used for resync.  I noticed previously that added drives weren't 
re-synced until the existing array build was done, and only then were 
they grabbed.  This, however, is a clean/active array that is rejecting 
the drive.

I've performed this identically on both a clean & active array and a 
newly-created (resync'ing) array, to the same effect.  Even after 
rebuild or reboot, the removed drive isn't taken back and remains 
listed as a "faulty spare", with dmesg indicating that it is 
"non-fresh".




// DMESG:

md: kicking non-fresh dm-0 from array!


// ARRAY status 'mdadm -D /dev/md0'

           State : active, degraded
  Active Devices : 13
Working Devices : 13
  Failed Devices : 1
   Spare Devices : 0

          Layout : near=1, offset=2
      Chunk Size : 512K

            Name : 0
            UUID : 05c2faf4:facfcad3:ba33b140:100f428a
          Events : 22

     Number   Major   Minor   RaidDevice State
        0     253        1        0      active sync   /dev/dm-1
        1     253        2        1      active sync   /dev/dm-2
        2     253        5        2      active sync   /dev/dm-5
        3     253        4        3      active sync   /dev/dm-4
        4     253        6        4      active sync   /dev/dm-6
        5     253        3        5      active sync   /dev/dm-3
        6     253       13        6      active sync   /dev/dm-13
        7       0        0        7      removed
        8     253        7        8      active sync   /dev/dm-7
        9     253        8        9      active sync   /dev/dm-8
       10     253        9       10      active sync   /dev/dm-9
       11     253       11       11      active sync   /dev/dm-11
       12     253       10       12      active sync   /dev/dm-10
       13     253       12       13      active sync   /dev/dm-12

        7     253        0        -      faulty spare   /dev/dm-0




Let me know what more I can do to help track this down.  I'm reverting 
this patch, since it is behaving less well than before.  Will be happy 
to try others.

Attached are typescripts of the drive remove/add sessions and all 
output.


/eli


Neil Brown wrote:
> On Friday October 6, estair@ilm.com wrote:
>  >
>  > This patch has resolved the immediate issue I was having on 2.6.18 with
>  > RAID10.  Previous to this change, after removing a device from the array
>  > (with mdadm --remove), physically pulling the device and
>  > changing/re-inserting, the "Number" of the new device would be
>  > incremented on top of the highest-present device in the array.  Now, it
>  > resumes its previous place.
>  >
>  > Does this look to be 'correct' output for a 14-drive array, which dev 8
>  > was failed/removed from then "add"'ed?  I'm trying to determine why the
>  > device doesn't get pulled back into the active configuration and
>  > re-synced.  Any comments?
> 
> Does this patch help?
> 
> 
> 
> Fix count of degraded drives in raid10.
> 
> 
> Signed-off-by: Neil Brown <neilb@suse.de>
> 
> ### Diffstat output
>  ./drivers/md/raid10.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff .prev/drivers/md/raid10.c ./drivers/md/raid10.c
> --- .prev/drivers/md/raid10.c   2006-10-09 14:18:00.000000000 +1000
> +++ ./drivers/md/raid10.c       2006-10-05 20:10:07.000000000 +1000
> @@ -2079,7 +2079,7 @@ static int run(mddev_t *mddev)
>                 disk = conf->mirrors + i;
> 
>                 if (!disk->rdev ||
> -                   !test_bit(In_sync, &rdev->flags)) {
> +                   !test_bit(In_sync, &disk->rdev->flags)) {
>                         disk->head_position = 0;
>                         mddev->degraded++;
>                 }
> 
> 
> NeilBrown
> 


[-- Attachment #2: gtmp-mdadm-add-drive-after-boot-to-degraded-array-fails-to-resync.log.gz --]
[-- Type: application/x-gzip, Size: 1349 bytes --]

[-- Attachment #3: gtmp-mdadm-remove-drive-from-activeclean-array-add-fails.log.gz --]
[-- Type: application/x-gzip, Size: 11331 bytes --]


* Re: [PATCH] md: Fix bug where new drives added to an md array sometimes don't sync properly.
  2006-10-10 20:42         ` Eli Stair
@ 2006-10-11  0:00           ` Eli Stair
  0 siblings, 0 replies; 11+ messages in thread
From: Eli Stair @ 2006-10-11  0:00 UTC (permalink / raw)
  To: linux-raid; +Cc: Neil Brown

[-- Attachment #1: Type: text/plain, Size: 6142 bytes --]


In testing this some more, I've determined that (always with this 
raid10.c patch, sometimes without) the kernel is not recognizing 
marked-faulty drives when they're added back to the array.  It appears 
some bit is flagged and (I assume) normally cleared when the drive is 
re-added as an array member.

If I zero the device (I'm assuming it's the wiping of the mdadm 
superblock that matters), it will be marked as "spare" instead of 
"faulty spare" upon issuing 'mdadm /dev/md0 -a /dev/dm-0'.  This 
behaviour has been erratic for a while, and I'm not sure whether I'm 
seeing a bug or whether I'm working under the wrong presumption and 
taking inappropriate actions on my part.

When a drive is either manually marked "failed" or is automatically 
tagged during a failure, is the expected user action to zero the 
(original or replacement) drive before doing an 'add'?  Should the 
kernel recognize that the drive was removed, and that the 'add' should 
clear any "faulty" or "failed" state?
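
(If it really is just the stale superblock, a hedged, lighter-weight 
alternative to dd-zeroing the whole device would be to clear only the 
md metadata before the add; device names as above:)

   mdadm /dev/md0 --remove /dev/dm-0
   mdadm --zero-superblock /dev/dm-0
   mdadm /dev/md0 --add /dev/dm-0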


/eli

PS - In the process of figuring out when this occurs and how to work 
around it, I hacked up the attached shell script, which takes care of 
removing the device from the array, zeroing it, re-reading/re-scanning 
the disk, and adding it back in, depending on which function is called.




Eli Stair wrote:
> 
> Thanks Neil,
> 
> I just gave this patched module a shot on four systems.  So far, I
> haven't seen the device number inappropriately increment, though as per
>   a mail I sent a short while ago that seemed remedied by using the 1.2
> superblock, for some reason.  However, it appears to have introduced a
> new issue, and another is unresolved by it:
> 
> 
> 
> // BUG 1
> The single-command syntax to fail and remove a drive is still failing, I
> do not know if this is somehow contributing to the further (new) issues
> below:
> 
>    [root@gtmp06 tmp]# mdadm /dev/md0 --fail /dev/dm-0 --remove /dev/dm-0
>    mdadm: set /dev/dm-0 faulty in /dev/md0
>    mdadm: hot remove failed for /dev/dm-0: Device or resource busy
> 
>    [root@gtmp06 tmp]# mdadm /dev/md0 --remove /dev/dm-0
>    mdadm: hot removed /dev/dm-0
> 
> 
> // BUG 2
> Now, upon adding or re-adding a "fail...remove"'d drive, it is not used
> for resync.  I realized previously that added drives weren't re-synced
> until the existing array build was done, then they were grabbed.  This
> however is a clean/active array that is rejecting the drive.
> 
> I've performed this identically on both a clean & active array, as well
> as a newly-created (resync'ing) array, to the same effect.  Even after
> rebuild or reboot, the removed drive isn't taken back and remains listed
> as a "faulty spare", with dmesg indicating that it is "non-fresh".
> 
> 
> 
> 
> // DMESG:
> 
> md: kicking non-fresh dm-0 from array!
> 
> 
> // ARRAY status 'mdadm -D /dev/md0'
> 
>            State : active, degraded
>   Active Devices : 13
> Working Devices : 13
>   Failed Devices : 1
>    Spare Devices : 0
> 
>           Layout : near=1, offset=2
>       Chunk Size : 512K
> 
>             Name : 0
>             UUID : 05c2faf4:facfcad3:ba33b140:100f428a
>           Events : 22
> 
>      Number   Major   Minor   RaidDevice State
>         0     253        1        0      active sync   /dev/dm-1
>         1     253        2        1      active sync   /dev/dm-2
>         2     253        5        2      active sync   /dev/dm-5
>         3     253        4        3      active sync   /dev/dm-4
>         4     253        6        4      active sync   /dev/dm-6
>         5     253        3        5      active sync   /dev/dm-3
>         6     253       13        6      active sync   /dev/dm-13
>         7       0        0        7      removed
>         8     253        7        8      active sync   /dev/dm-7
>         9     253        8        9      active sync   /dev/dm-8
>        10     253        9       10      active sync   /dev/dm-9
>        11     253       11       11      active sync   /dev/dm-11
>        12     253       10       12      active sync   /dev/dm-10
>        13     253       12       13      active sync   /dev/dm-12
> 
>         7     253        0        -      faulty spare   /dev/dm-0
> 
> 
> 
> 
> Let me know what more I can do to help track this down.  I'm reverting
> this patch, since it is behaving less-well than before.  Will be happy
> to try others.
> 
> Attached are typescript of the drive remove/add sessions and all output.
> 
> 
> /eli
> 
> 
> Neil Brown wrote:
>  > On Friday October 6, estair@ilm.com wrote:
>  >  >
>  >  > This patch has resolved the immediate issue I was having on 2.6.18 
> with
>  >  > RAID10.  Previous to this change, after removing a device from the 
> array
>  >  > (with mdadm --remove), physically pulling the device and
>  >  > changing/re-inserting, the "Number" of the new device would be
>  >  > incremented on top of the highest-present device in the array.  
> Now, it
>  >  > resumes its previous place.
>  >  >
>  >  > Does this look to be 'correct' output for a 14-drive array, which 
> dev 8
>  >  > was failed/removed from then "add"'ed?  I'm trying to determine 
> why the
>  >  > device doesn't get pulled back into the active configuration and
>  >  > re-synced.  Any comments?
>  >
>  > Does this patch help?
>  >
>  >
>  >
>  > Fix count of degraded drives in raid10.
>  >
>  >
>  > Signed-off-by: Neil Brown <neilb@suse.de>
>  >
>  > ### Diffstat output
>  >  ./drivers/md/raid10.c |    2 +-
>  >  1 file changed, 1 insertion(+), 1 deletion(-)
>  >
>  > diff .prev/drivers/md/raid10.c ./drivers/md/raid10.c
>  > --- .prev/drivers/md/raid10.c   2006-10-09 14:18:00.000000000 +1000
>  > +++ ./drivers/md/raid10.c       2006-10-05 20:10:07.000000000 +1000
>  > @@ -2079,7 +2079,7 @@ static int run(mddev_t *mddev)
>  >                 disk = conf->mirrors + i;
>  >
>  >                 if (!disk->rdev ||
>  > -                   !test_bit(In_sync, &rdev->flags)) {
>  > +                   !test_bit(In_sync, &disk->rdev->flags)) {
>  >                         disk->head_position = 0;
>  >                         mddev->degraded++;
>  >                 }
>  >
>  >
>  > NeilBrown
>  >
> 


[-- Attachment #2: mdadm-replace-drive.sh --]
[-- Type: text/plain, Size: 2778 bytes --]

#!/bin/bash
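# Usage (inferred from the functions below; $1 is invoked directly as a
# function name, so it must be one of these two):
#   mdadm-replace-drive.sh DISK_REMOVE  <array> <drive>   fail <drive> and remove it from <array>
#   mdadm-replace-drive.sh DISK_READMIT <array> <drive>   zero <drive>, rescan it and add it back to <array>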

MODE=$1
ARRAY=$2
DRIVE=$3

# use a function rather than an alias: aliases are not expanded in
# non-interactive scripts, so the "| logger" pipes below would otherwise
# bypass the -s/-t options
logger() { command logger -s -t mdadm_replace "$@"; }

function DISK_REMOVE {

  mdadm -D $ARRAY | grep -E "${DRIVE}$" > /dev/null
  MD_DEV_PRESENT="$?"
  if [ "$MD_DEV_PRESENT" == "0" ] ; then
    echo "// SETTING DRIVE($DRIVE) STATE TO FAULTY "
    echo mdadm $ARRAY -f $DRIVE
    mdadm $ARRAY -f $DRIVE
    sleep 5
    echo "// REMOVING DRIVE($DRIVE) FROM ARRAY($ARRAY) "
    echo mdadm $ARRAY -r $DRIVE
    mdadm $ARRAY -r $DRIVE
    MD_DEV_REMOVE="$?"
    if [ "$MD_DEV_REMOVE" != "0" ] ; then
      echo "// DEVICE ($DRIVE) FAILED TO REMOVE, EXITING UNCLEANLY! "
      exit 1
    fi

  else
    echo "// DEVICE ($DRIVE) NOT LISTED, EXITING (PERHAPS REMOVED ALREADY...) "
    return 1
  fi

} #/FUNCTION DISK_REMOVE

function DISK_READMIT {

  echo "// ZEROING DRIVE BEFORE ADMITTING BACK TO ARRAY "
  # set 1MB blocksize for 'dd'
  BS="1048576"
  # get blockdev size in bytes:
  DISK_BYTES=`fdisk -l ${DRIVE} 2>/dev/null | grep bytes | head -1 | awk '{print $5}'`
  # calculate for DD, to write at offset 64MB from the end of device -> end of device
  (( DISK_BYTES_OFFSET=( $DISK_BYTES / $BS ) - 64 ))

  # zero 64MB at start of drive
  dd if=/dev/zero of=${DRIVE} bs=1M count=64
  DD_ERR=$?
  # check and make sure we are writing to the drive, else exit with an error
  if [ "${DD_ERR}" != "0" ] ; then
    echo "COULD NOT ZERO DEVICE ${DRIVE}, EXITING WITH ERROR ${DD_ERR} "
    exit 1
  fi
  # zero drive starting at offset 64MB from end until it hits the wall
  dd if=/dev/zero of=${DRIVE} bs=1M seek=$DISK_BYTES_OFFSET

  # hackey-like reload of device now that it's been b0rked:
  echo "// RE-LOADING DRIVE BY KERNEL "
  MASTER=`basename $DRIVE`
  for SLAVE in /sys/block/$MASTER/slaves/* ; do 
    #echo "SLAVE is $SLAVE "
    RDEV=`basename $SLAVE`
    echo 1 > $SLAVE/device/rescan 
    sleep 1
    blockdev --rereadpt /dev/$RDEV >/dev/null 2>&1
  done
  blockdev --rereadpt $DRIVE >/dev/null 2>&1

  # sleep a few to let udev/kernel catch up
  sleep 10

  # add device back to array now:
  echo "// mdadm $ARRAY -a $DRIVE"
  mdadm $ARRAY -a $DRIVE
  MD_ADD_ERR="$?"
  if [ "${MD_ADD_ERR}" != "0" ] ; then
    echo "COULD NOT ADD DEVICE($DRIVE) BACK TO ARRAY($ARRAY), EXITING WITH ERROR ${DD_ERR} "
    exit 1
  fi

  # array now has the replacement drive.  show status and exit:
  echo -e "\n#### ARRAY $ARRAY STATUS: " | logger
  cat /proc/mdstat | logger
  echo -e "\n#### DEVICE $DRIVE STATUS: " | logger
  mdadm --examine $DRIVE | logger
  exit 0

} #/FUNCTION DISK_READMIT


### END SETUP.
### START PROGRAM FLOW:

if [ -z "${MODE}" ] || [ -z "${ARRAY}" ] || [ -z "${DRIVE}" ] ; then

  ### necessary input vals not defined, exit.
  echo "VALUES NOT SET "
  exit 1

else

  ### ALL CLEAR, run program
  # function
  $MODE
  
fi



* Re: [PATCH] md: Fix bug where new drives added to an md array sometimes don't sync properly.
  2006-10-10  2:00       ` Neil Brown
  2006-10-10 20:42         ` Eli Stair
@ 2006-10-12 10:02         ` Michael Tokarev
  2006-10-17  0:13           ` Neil Brown
  2006-10-18 22:18         ` Eli Stair
  2 siblings, 1 reply; 11+ messages in thread
From: Michael Tokarev @ 2006-10-12 10:02 UTC (permalink / raw)
  To: Neil Brown; +Cc: Eli Stair, linux-raid

Neil Brown wrote:
[]
> Fix count of degraded drives in raid10.
> 
> Signed-off-by: Neil Brown <neilb@suse.de>
> 
> --- .prev/drivers/md/raid10.c	2006-10-09 14:18:00.000000000 +1000
> +++ ./drivers/md/raid10.c	2006-10-05 20:10:07.000000000 +1000
> @@ -2079,7 +2079,7 @@ static int run(mddev_t *mddev)
>  		disk = conf->mirrors + i;
>  
>  		if (!disk->rdev ||
> -		    !test_bit(In_sync, &rdev->flags)) {
> +		    !test_bit(In_sync, &disk->rdev->flags)) {
>  			disk->head_position = 0;
>  			mddev->degraded++;
>  		}

Neil, this makes me nervous.  Seriously.

How many bugs like this have been fixed so far? 10? 50?  I stopped
counting a long time ago.  And it's the same thing in every case:
misuse of a local rdev vs disk->rdev.  The same pattern.

I wonder if it can be avoided in the first place somehow.  Maybe don't
declare and use a local variable `rdev' (not that name specifically,
but anything with those semantics), and instead always write
disk->rdev or mddev->whatever explicitly in every place, letting the
compiler optimize the dereference where it can?

And btw, this is another 2.6.18.1 candidate (if it's not too late already).

Thanks.

/mjt


* Re: [PATCH] md: Fix bug where new drives added to an md array sometimes don't sync properly.
  2006-10-12 10:02         ` Michael Tokarev
@ 2006-10-17  0:13           ` Neil Brown
  0 siblings, 0 replies; 11+ messages in thread
From: Neil Brown @ 2006-10-17  0:13 UTC (permalink / raw)
  To: Michael Tokarev; +Cc: Eli Stair, linux-raid

On Thursday October 12, mjt@tls.msk.ru wrote:
> Neil Brown wrote:
> []
> > Fix count of degraded drives in raid10.
> > 
> > Signed-off-by: Neil Brown <neilb@suse.de>
> > 
> > --- .prev/drivers/md/raid10.c	2006-10-09 14:18:00.000000000 +1000
> > +++ ./drivers/md/raid10.c	2006-10-05 20:10:07.000000000 +1000
> > @@ -2079,7 +2079,7 @@ static int run(mddev_t *mddev)
> >  		disk = conf->mirrors + i;
> >  
> >  		if (!disk->rdev ||
> > -		    !test_bit(In_sync, &rdev->flags)) {
> > +		    !test_bit(In_sync, &disk->rdev->flags)) {
> >  			disk->head_position = 0;
> >  			mddev->degraded++;
> >  		}
> 
> Neil, this makes me nervous.  Seriously.

Yes.  Bugs are a problem.

> 
> How many bugs like this has been fixed so far? 10? 50?  I stopped counting
> long time ago.  And it's the same thing in every case - misuse of rdev vs
> disk->rdev.  The same pattern.

I really don't think there have been that many that follow that
pattern that closely. Maybe 2 or 3.

> 
> I wonder if it can be avoided in the first place somehow - maybe don't
> declare and use local variable `rdev' (not by name, but by the semantics
> of it), and always use disk->rdev or mddev->whatever in every place,
> explicitly, and let the compiler optimize the deref if possible?
> 

There certainly are styles of programming and rules for choosing names
that can help reduce bugs.  And the kernel style does encourage some
good practices.
But that won't be enough by itself.  We need good style, and a review
process, and testing.  And still bugs will get through, but there
should be fewer.  You are welcome to help with any of these.

I hope to set up a more structured testing system soon which should be
able to catch this sort of bug at least.

> And btw, this is another 2.6.18.1 candidate (if it's not too late already).

Yes, it was too late for 2.6.18.1.  I'll submit it for 2.6.18.2.

Thanks,
NeilBrown


* Re: [PATCH] md: Fix bug where new drives added to an md array sometimes don't sync properly.
  2006-10-10  2:00       ` Neil Brown
  2006-10-10 20:42         ` Eli Stair
  2006-10-12 10:02         ` Michael Tokarev
@ 2006-10-18 22:18         ` Eli Stair
  2006-10-20  3:24           ` Neil Brown
  2 siblings, 1 reply; 11+ messages in thread
From: Eli Stair @ 2006-10-18 22:18 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-raid


FYI, I'm testing 2.6.18.1 and noticed that this RAID10 member
mis-numbering issue is still present.  Even with this fix applied to 
raid10.c, I am still seeing repeatable cases of devices assuming a 
"Number" greater than the one they had when removed from a running 
array.

Issue 1)

I'm seeing inconsistencies in the way a drive is marked (and its 
behaviour) during rebuild after it is removed and added.  In this 
instance, the re-added drive is picked up and marked as "spare 
rebuilding".

  Rebuild Status : 20% complete

            Name : 0
            UUID : ab764369:7cf80f2b:cf61b6df:0b13cd3a
          Events : 1

     Number   Major   Minor   RaidDevice State
        0     253        0        0      active sync   /dev/dm-0
        1     253        1        1      active sync   /dev/dm-1
        2     253       10        2      active sync   /dev/dm-10
        3     253       11        3      active sync   /dev/dm-11
        4     253       12        4      active sync   /dev/dm-12
        5     253       13        5      active sync   /dev/dm-13
        6     253        2        6      active sync   /dev/dm-2
        7     253        3        7      active sync   /dev/dm-3
        8     253        4        8      active sync   /dev/dm-4
        9     253        5        9      active sync   /dev/dm-5
       10     253        6       10      active sync   /dev/dm-6
       11     253        7       11      active sync   /dev/dm-7
       12     253        8       12      active sync   /dev/dm-8
       13     253        9       13      active sync   /dev/dm-9
[root@gtmp02 ~]# cat /proc/mdstat
Personalities : [raid10]
md0 : active raid10 dm-9[13] dm-8[12] dm-7[11] dm-6[10] dm-5[9] dm-4[8] 
dm-3[7] dm-2[6] dm-13[5] dm-12[4] dm-11[3] dm-10[2] dm-1[1] dm-0[0]
       1003620352 blocks super 1.2 512K chunks 2 offset-copies [14/14] 
[UUUUUUUUUUUUUU]
       [====>................]  resync = 21.7% (218664064/1003620352) 
finish=114.1min speed=114596K/sec




However, on the same configuration, it occasionally is pulled right 
back with a state of "active sync", without any indication that it is 
dirty.

Issue 2)

When a device is removed and subsequently added again (after setting it 
failed and removing it from the array), it SHOULD be set back to the 
"Number" it originally had in the array, correct?  In the cases where 
the drive is NOT automatically marked "active sync" and all members 
show up fine, it is picked up as a spare and a rebuild is started, 
during which time it is marked down ("_") in the /proc/mdstat output 
and "spare rebuilding" in mdadm -D output:



The listings below follow device /dev/dm-10 through the cycle (clean, 
failed, removed, re-added):


// STATE WHEN CLEAN:
            UUID : 6ccd7974:1b23f5b2:047d1560:b5922692

     Number   Major   Minor   RaidDevice State
        0     253        0        0      active sync   /dev/dm-0
        1     253        1        1      active sync   /dev/dm-1
        2     253       10        2      active sync   /dev/dm-10
        3     253       11        3      active sync   /dev/dm-11
        4     253       12        4      active sync   /dev/dm-12
        5     253       13        5      active sync   /dev/dm-13
        6     253        2        6      active sync   /dev/dm-2
        7     253        3        7      active sync   /dev/dm-3
        8     253        4        8      active sync   /dev/dm-4
        9     253        5        9      active sync   /dev/dm-5
       10     253        6       10      active sync   /dev/dm-6
       11     253        7       11      active sync   /dev/dm-7
       12     253        8       12      active sync   /dev/dm-8
       13     253        9       13      active sync   /dev/dm-9


// STATE AFTER FAILURE:
     Number   Major   Minor   RaidDevice State
        0     253        0        0      active sync   /dev/dm-0
        1     253        1        1      active sync   /dev/dm-1
        2       0        0        2      removed
        3     253       11        3      active sync   /dev/dm-11
        4     253       12        4      active sync   /dev/dm-12
        5     253       13        5      active sync   /dev/dm-13
        6     253        2        6      active sync   /dev/dm-2
        7     253        3        7      active sync   /dev/dm-3
        8     253        4        8      active sync   /dev/dm-4
        9     253        5        9      active sync   /dev/dm-5
       10     253        6       10      active sync   /dev/dm-6
       11     253        7       11      active sync   /dev/dm-7
       12     253        8       12      active sync   /dev/dm-8
       13     253        9       13      active sync   /dev/dm-9

        2     253       10        -      faulty spare   /dev/dm-10

// STATE AFTER REMOVAL:
     Number   Major   Minor   RaidDevice State
        0     253        0        0      active sync   /dev/dm-0
        1     253        1        1      active sync   /dev/dm-1
        2       0        0        2      removed
        3     253       11        3      active sync   /dev/dm-11
        4     253       12        4      active sync   /dev/dm-12
        5     253       13        5      active sync   /dev/dm-13
        6     253        2        6      active sync   /dev/dm-2
        7     253        3        7      active sync   /dev/dm-3
        8     253        4        8      active sync   /dev/dm-4
        9     253        5        9      active sync   /dev/dm-5
       10     253        6       10      active sync   /dev/dm-6
       11     253        7       11      active sync   /dev/dm-7
       12     253        8       12      active sync   /dev/dm-8
       13     253        9       13      active sync   /dev/dm-9

// STATE AFTER RE-ADD:
     Number   Major   Minor   RaidDevice State
        0     253        0        0      active sync   /dev/dm-0
        1     253        1        1      active sync   /dev/dm-1
       14     253       10        2      spare rebuilding   /dev/dm-10
        3     253       11        3      active sync   /dev/dm-11
        4     253       12        4      active sync   /dev/dm-12
        5     253       13        5      active sync   /dev/dm-13
        6     253        2        6      active sync   /dev/dm-2
        7     253        3        7      active sync   /dev/dm-3
        8     253        4        8      active sync   /dev/dm-4
        9     253        5        9      active sync   /dev/dm-5
       10     253        6       10      active sync   /dev/dm-6
       11     253        7       11      active sync   /dev/dm-7
       12     253        8       12      active sync   /dev/dm-8
       13     253        9       13      active sync   /dev/dm-9
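
(A small hedged sketch, using the device from the listing above, to 
wait for that rebuild to finish and then re-check the slot assignment; 
it just polls /proc/mdstat:)

   while grep -q -E 'recovery|resync' /proc/mdstat ; do sleep 60 ; done
   mdadm -D /dev/md0 | grep dm-10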







/eli


// raid10.c:

         for (i = 0; i < conf->raid_disks; i++) {

                 disk = conf->mirrors + i;

                 if (!disk->rdev ||
                     !test_bit(In_sync, &rdev->flags)) {
                         disk->head_position = 0;
                         mddev->degraded++;
                 }
         }

// END raid10.c




Neil Brown wrote:
> On Friday October 6, estair@ilm.com wrote:
>  >
>  > This patch has resolved the immediate issue I was having on 2.6.18 with
>  > RAID10.  Previous to this change, after removing a device from the array
>  > (with mdadm --remove), physically pulling the device and
>  > changing/re-inserting, the "Number" of the new device would be
>  > incremented on top of the highest-present device in the array.  Now, it
>  > resumes its previous place.
>  >
>  > Does this look to be 'correct' output for a 14-drive array, which dev 8
>  > was failed/removed from then "add"'ed?  I'm trying to determine why the
>  > device doesn't get pulled back into the active configuration and
>  > re-synced.  Any comments?
> 
> Does this patch help?
> 
> 
> 
> Fix count of degraded drives in raid10.
> 
> 
> Signed-off-by: Neil Brown <neilb@suse.de>
> 
> ### Diffstat output
>  ./drivers/md/raid10.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff .prev/drivers/md/raid10.c ./drivers/md/raid10.c
> --- .prev/drivers/md/raid10.c   2006-10-09 14:18:00.000000000 +1000
> +++ ./drivers/md/raid10.c       2006-10-05 20:10:07.000000000 +1000
> @@ -2079,7 +2079,7 @@ static int run(mddev_t *mddev)
>                 disk = conf->mirrors + i;
> 
>                 if (!disk->rdev ||
> -                   !test_bit(In_sync, &rdev->flags)) {
> +                   !test_bit(In_sync, &disk->rdev->flags)) {
>                         disk->head_position = 0;
>                         mddev->degraded++;
>                 }
> 
> 
> NeilBrown
> 




* Re: [PATCH] md: Fix bug where new drives added to an md array sometimes don't sync properly.
  2006-10-18 22:18         ` Eli Stair
@ 2006-10-20  3:24           ` Neil Brown
  0 siblings, 0 replies; 11+ messages in thread
From: Neil Brown @ 2006-10-20  3:24 UTC (permalink / raw)
  To: Eli Stair; +Cc: linux-raid

On Wednesday October 18, estair@ilm.com wrote:
> 
> FYI, I'm testing 2.6.18.1 and noticed this mis-numbering of RAID10
> members issue is still extant.  Even with this fix applied to raid10.c, 
> I am still seeing repeatable issues with devices assuming a "Number" 
> greater than that which they had when removed from a running array.

That 'Number' change isn't a problem.  It is really just an arbitrary
label to track a device.  Once you fail a device and re-add it, md
considers it to be a new device.
As long as the 'RaidDevice' number is what you expect, everything is
working properly.
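
(A quick hedged way to check that, using the device names from earlier 
in the thread: the RaidDevice slot is the fourth column of mdadm -D 
output, so something like

   mdadm -D /dev/md0 | grep -w dm-10

should show dm-10 back at RaidDevice 2 once the recovery completes.)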

NeilBrown

