Linux RAID subsystem development
* raid10 - won't rebuild - assigns all added disks as spares
@ 2014-11-25  1:49 Jonathan Molyneux
  2014-11-25  2:28 ` NeilBrown
  0 siblings, 1 reply; 3+ messages in thread
From: Jonathan Molyneux @ 2014-11-25  1:49 UTC (permalink / raw)
  To: linux RAID

Hi Everyone,

Have a strange situation that hasn't happened before.
Running Debian 7.7 with kernel version 3.2.63-2+deb7u1.
Have a raid10 that runs the server (it boots off a raid1) and that, after
replacing a failed disk, just won't rebuild.

This is what it looks like without the disk (failed & removed):
md1 : active raid10 sda2[6] sdc2[4] sdb2[1]
       1952987136 blocks super 1.2 512K chunks 2 far-copies [4/3] [UUU_]
       bitmap: 8/15 pages [32KB], 65536KB chunk

Then when the disk is added:
md1 : active raid10 sdd2[5](S) sda2[6] sdc2[4] sdb2[1]
       1952987136 blocks super 1.2 512K chunks 2 far-copies [4/3] [UUU_]
       bitmap: 8/15 pages [32KB], 65536KB chunk

Nothing unusual is being spat out in dmesg.
When removing the disk:
[313434.073997] md: unbind<sdd2>
[313434.138307] md: export_rdev(sdd2)
When adding the disk:
[313468.056484] md: bind<sdd2>

This is a strange one that I haven't had before.
Any thoughts on how to kick the rebuild off without needing a reboot?

PS: Rebooting the server is an option; it would just require some scheduling.
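
The symptom described above (a newly added member flagged (S) while the
array still reports [4/3]) can be checked from a script. This is a minimal
sketch, assuming the /proc/mdstat layout shown in this message. The function
name list_spares is hypothetical, and the optional second argument exists
only so the parser can be pointed at a saved copy of the file:

```shell
#!/bin/sh
# Sketch only: list_spares is a hypothetical helper, not part of mdadm.
# Prints the member devices of the given array that carry the (S) flag.
# Usage: list_spares ARRAY [MDSTAT_FILE]
list_spares() {
    array="$1"
    mdstat="${2:-/proc/mdstat}"
    awk -v a="$array" '
        # Header line looks like: md1 : active raid10 sdd2[5](S) sda2[6] ...
        $1 == a && $2 == ":" {
            for (i = 3; i <= NF; i++)
                if ($i ~ /\(S\)$/) {
                    sub(/\[.*/, "", $i)   # drop the "[5](S)" suffix
                    print $i
                }
        }
    ' "$mdstat"
}
```

Calling `list_spares md1` on the second listing above should name the
partition that was added but has not been pulled into the array.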

* Re: raid10 - won't rebuild - assigns all added disks as spares
  2014-11-25  1:49 raid10 - won't rebuild - assigns all added disks as spares Jonathan Molyneux
@ 2014-11-25  2:28 ` NeilBrown
  2014-11-25  3:44   ` Jonathan Molyneux
  0 siblings, 1 reply; 3+ messages in thread
From: NeilBrown @ 2014-11-25  2:28 UTC (permalink / raw)
  To: Jonathan Molyneux; +Cc: linux RAID

On Tue, 25 Nov 2014 12:49:12 +1100 Jonathan Molyneux
<jonathan@infinitedepth.com.au> wrote:

> This is a strange one that I haven't had before.
> Any thoughts on how to kick the rebuild off without needing a reboot ?

I'm sure I've seen this bug before... and fixed it.
I don't remember the details and cannot find anything obvious in change logs.

You could try

   echo recover > /sys/block/md1/md/sync_action

Alternatively, if you are re-adding a disk that has just been removed, you could

   mdadm /dev/md1 --remove /dev/sdd2
   mdadm --zero /dev/sdd2
   mdadm /dev/md1 --add /dev/sdd2

that will force a full recovery instead of just a bitmap-based recovery.
That will of course take longer than a bitmap-based recovery, but seeing as the
bitmap-based recovery isn't starting, that could still be an improvement.

NeilBrown
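
The sync_action suggestion above can be wrapped in a guard so that it only
fires when the array is degraded and no sync is already running. This is a
rough sketch, not something from the thread: the function name kick_recovery
is hypothetical, and it assumes the md sysfs directory exposes the `degraded`
and `sync_action` attributes. The directory is a parameter so the logic can
be tried against a scratch directory first:

```shell
#!/bin/sh
# Sketch only: kick_recovery is a hypothetical helper.
# Usage: kick_recovery [MD_SYSFS_DIR]   (default: /sys/block/md1/md)
kick_recovery() {
    md_dir="${1:-/sys/block/md1/md}"

    # Number of missing members, and the current sync state.
    degraded=$(cat "$md_dir/degraded") || return 2
    action=$(cat "$md_dir/sync_action") || return 2

    if [ "$degraded" -gt 0 ] && [ "$action" = "idle" ]; then
        # Same write as the command suggested in the message above.
        echo recover > "$md_dir/sync_action"
        echo "requested recovery"
    else
        echo "nothing to do (degraded=$degraded, sync_action=$action)"
    fi
}
```

On a real system this needs root, for example `kick_recovery /sys/block/md1/md`.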


* Re: raid10 - won't rebuild - assigns all added disks as spares
  2014-11-25  2:28 ` NeilBrown
@ 2014-11-25  3:44   ` Jonathan Molyneux
  0 siblings, 0 replies; 3+ messages in thread
From: Jonathan Molyneux @ 2014-11-25  3:44 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux RAID

Thanks Neil,

> echo recover > /sys/block/md1/md/sync_action

That did the trick.

Regards
Jonathan
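
Once the recovery has been requested, its progress can be followed in
/proc/mdstat. This is a small sketch, assuming the kernel prints a
"recovery =" or "resync =" progress line inside the array's stanza while a
sync is running. The function name recovery_line is hypothetical, and the
optional second argument is only there so a saved copy of the file can be
parsed:

```shell
#!/bin/sh
# Sketch only: recovery_line is a hypothetical helper.
# Prints the progress line for ARRAY; exits non-zero if none is present.
# Usage: recovery_line ARRAY [MDSTAT_FILE]
recovery_line() {
    array="$1"
    mdstat="${2:-/proc/mdstat}"
    awk -v a="$array" '
        # Any "name : ..." line starts a new stanza; remember if it is ours.
        $2 == ":" { in_array = ($1 == a); next }
        in_array && /recovery =|resync =/ { print; found = 1 }
        END { exit !found }
    ' "$mdstat"
}
```

For example, `recovery_line md1 || echo "no recovery running on md1"`.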
