* raid10 - won't rebuild - assigns all added disks as spares
@ 2014-11-25 1:49 Jonathan Molyneux
2014-11-25 2:28 ` NeilBrown
From: Jonathan Molyneux @ 2014-11-25 1:49 UTC
To: linux RAID
Hi Everyone,
Have a strange situation that hasn't happened before.
Running Debian 7.7 with kernel version 3.2.63-2+deb7u1.
Have a raid10 that runs the server (it boots off a raid1) and that, after
replacing a failed disk, just won't rebuild.
This is what it looks like without the disk (failed & removed):
md1 : active raid10 sda2[6] sdc2[4] sdb2[1]
      1952987136 blocks super 1.2 512K chunks 2 far-copies [4/3] [UUU_]
      bitmap: 8/15 pages [32KB], 65536KB chunk
Then when the disk is added:
md1 : active raid10 sdd2[5](S) sda2[6] sdc2[4] sdb2[1]
      1952987136 blocks super 1.2 512K chunks 2 far-copies [4/3] [UUU_]
      bitmap: 8/15 pages [32KB], 65536KB chunk
Nothing unusual is being spat out in dmesg.
When removing the disk:
[313434.073997] md: unbind<sdd2>
[313434.138307] md: export_rdev(sdd2)
When adding the disk:
[313468.056484] md: bind<sdd2>
This is a strange one that I haven't had before.
Any thoughts on how to kick the rebuild off without needing a reboot?
PS
Rebooting the server is an option, just would require some scheduling.
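
The stuck state in the /proc/mdstat output above (a degraded array, `[UUU_]`, that holds a spare marked `(S)` but shows no recovery progress line) can be spotted with a small filter. This is only a rough sketch: the function name `find_stuck_arrays` is made up for illustration, and it assumes the usual mdstat layout in which each array starts with a `mdX :` line.

```shell
#!/bin/sh
# Hypothetical helper: print the names of md arrays that are degraded,
# hold at least one spare "(S)", and show no sync progress line.
# Reads /proc/mdstat-style text on stdin.
find_stuck_arrays() {
    awk '
        function report() {
            if (name != "" && degraded && spare && !busy) print name
        }
        # Start of a new array block, e.g. "md1 : active raid10 ..."
        /^md[^ ]* :/ {
            report()
            name = $1
            spare = ($0 ~ /\(S\)/)
            degraded = 0
            busy = 0
            next
        }
        # Member map such as "[UUU_]": an underscore marks a missing member.
        /\[[U_]*_[U_]*\]/ { degraded = 1 }
        # Progress or pending line, e.g. "recovery = 1.2% ..." or "resync=DELAYED".
        /(recovery|resync|reshape|check) *=/ { busy = 1 }
        END { report() }
    '
}
```

It would typically be run as `find_stuck_arrays < /proc/mdstat`; for the second listing above it should print `md1`, and for the first (no spare present) it should print nothing.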
* Re: raid10 - won't rebuild - assigns all added disks as spares
From: NeilBrown @ 2014-11-25 2:28 UTC
To: Jonathan Molyneux; +Cc: linux RAID
On Tue, 25 Nov 2014 12:49:12 +1100 Jonathan Molyneux
<jonathan@infinitedepth.com.au> wrote:
> Hi Everyone,
>
> Have a strange situation that hasn't happened before.
> Running Debian 7.7 with kernel version 3.2.63-2+deb7u1.
> Have a raid10 that runs the server (it boots off a raid1) and that, after
> replacing a failed disk, just won't rebuild.
>
> This is what it looks like without the disk (failed & removed):
> md1 : active raid10 sda2[6] sdc2[4] sdb2[1]
>       1952987136 blocks super 1.2 512K chunks 2 far-copies [4/3] [UUU_]
>       bitmap: 8/15 pages [32KB], 65536KB chunk
>
> Then when the disk is added:
> md1 : active raid10 sdd2[5](S) sda2[6] sdc2[4] sdb2[1]
>       1952987136 blocks super 1.2 512K chunks 2 far-copies [4/3] [UUU_]
>       bitmap: 8/15 pages [32KB], 65536KB chunk
>
> Nothing unusual is being spat out in dmesg.
> When removing the disk:
> [313434.073997] md: unbind<sdd2>
> [313434.138307] md: export_rdev(sdd2)
> When adding the disk:
> [313468.056484] md: bind<sdd2>
>
> This is a strange one that I haven't had before.
> Any thoughts on how to kick the rebuild off without needing a reboot?
I'm sure I've seen this bug before... and fixed it.
I don't remember the details and cannot find anything obvious in change logs.
You could try

    echo recover > /sys/block/md1/md/sync_action

Alternatively, if you are re-adding a disk that had just been removed, you
could run

    mdadm /dev/md1 --remove /dev/sdd2
    mdadm --zero-superblock /dev/sdd2
    mdadm /dev/md1 --add /dev/sdd2

That will force a full recovery instead of just a bitmap-based recovery.
A full recovery will of course take longer, but seeing as the bitmap-based
recovery isn't starting, it could still be an improvement.
NeilBrown
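
The `sync_action` write suggested above can be wrapped in a guard so that it only fires when the array is degraded and nothing else is running. What follows is a minimal sketch, not a tested procedure: the function name `kick_recovery` is hypothetical, it takes the array's md sysfs directory as its argument, and it assumes that directory exposes the `degraded` and `sync_action` attributes. Writing to the real sysfs files needs root.

```shell
#!/bin/sh
# Hypothetical helper: ask md to start recovery if the array is degraded
# and currently idle. $1 is the md sysfs directory, e.g. /sys/block/md1/md.
kick_recovery() {
    md_dir=$1
    degraded=$(cat "$md_dir/degraded") || return 1
    action=$(cat "$md_dir/sync_action") || return 1

    if [ "$degraded" -eq 0 ]; then
        echo "array is not degraded; nothing to do"
        return 0
    fi
    if [ "$action" != "idle" ]; then
        echo "sync_action is already '$action'; leaving it alone"
        return 0
    fi
    # Same write as suggested in the thread.
    echo recover > "$md_dir/sync_action"
    echo "requested recovery"
}
```

For the array in this thread the call would be `kick_recovery /sys/block/md1/md`, after which /proc/mdstat should show whether a recovery has actually started.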
* Re: raid10 - won't rebuild - assigns all added disks as spares
From: Jonathan Molyneux @ 2014-11-25 3:44 UTC
To: NeilBrown; +Cc: linux RAID
Thanks Neil,
> echo recover > /sys/block/md1/md/sync_action
That did the trick.
Regards
Jonathan