* RAID5 attempt rebuilding despite being incomplete
From: bart @ 2005-06-23 11:31 UTC
To: linux-raid
Hi,
I have the problem that my RAID5 array (created with 4 drives) is resyncing
despite the fact that I removed one drive. Any idea what it is doing?
To trigger this (the full command sequence is sketched below the list):
- Created RAID5 array with 4 partitions (no spares).
- Set one drive to fail with 'mdadm --fail /dev/md3 /dev/hdf4'
- Removed this drive with 'mdadm --remove /dev/md3 /dev/hdf4'
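Spelled out as commands (a sketch; the create invocation is reconstructed
from the follow-up messages in this thread, where the '--run' flag is
mentioned):
--------------------------------------------------
# Create the array; '--run' was set here (see the follow-ups below).
mdadm --create /dev/md3 --run --level=5 --raid-devices=4 /dev/hd[abef]4
# Fail and remove one member before the initial sync has completed.
mdadm --fail /dev/md3 /dev/hdf4
mdadm --remove /dev/md3 /dev/hdf4
# The array is now degraded [4/3], yet it keeps resyncing.
cat /proc/mdstat
--------------------------------------------------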
After this I get:
---------------------------------------------------
# mdadm --detail /dev/md3
/dev/md3:
Version : 00.90.01
Creation Time : Thu Jun 23 11:43:59 2005
Raid Level : raid5
Array Size : 228151872 (217.58 GiB 233.63 GB)
Device Size : 76050624 (72.53 GiB 77.88 GB)
Raid Devices : 4
Total Devices : 3
Preferred Minor : 3
Persistence : Superblock is persistent
Update Time : Thu Jun 23 12:26:19 2005
State : active, degraded, resyncing
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 64K
Rebuild Status : 24% complete
UUID : b9e2b624:b9bef9df:dcdec561:0c3a8dfd
Events : 0.9
Number Major Minor RaidDevice State
0 3 4 0 active sync /dev/.static/dev/hda4
1 3 68 1 active sync /dev/.static/dev/hdb4
2 33 4 2 active sync /dev/.static/dev/hde4
3 0 0 - removed
--------------------------------------------------
and /proc/mdstat says:
--------------------------------------------------
md3 : active raid5 hde4[2] hdb4[1] hda4[0]
228151872 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
[=====>...............] resync = 25.5% (19442692/76050624) finish=101.0min speed=9332K/sec
--------------------------------------------------
Removing all the drives seems to lead to the array resyncing with 0 drives
in a loop, causing /var/log/messages to grow at a rate of > 100 MByte/sec :(
Isn't it a bug that the array starts resyncing without checking whether the
drives needed for a synced state are present?
Bart
* Re: RAID5 attempt rebuilding despite being incomplete
From: Neil Brown @ 2005-06-23 13:13 UTC
To: bart; +Cc: linux-raid
On Thursday June 23, bart@ardistech.com wrote:
> Hi,
>
> I have the problem that my RAID5 array (created with 4 drives) is resyncing
> despite the fact that I removed one drive. Any idea what it is doing?
Sounds familiar. I thought I had fixed it. What kernel are you
running?
NeilBrown
* Re: RAID5 attempt rebuilding despite being incomplete
From: bart @ 2005-06-23 13:45 UTC
To: Neil Brown; +Cc: linux-raid
Hi Neil,
> > I have the problem that my RAID5 array (created with 4 drives) is resyncing
> > despite the fact that I removed one drive. Any idea what it is doing?
>
> Sounds familiar. I thought I had fixed it. What kernel are you
> running?
>
I'm running a 2.6.11 kernel; it should be pretty up to date.
I forgot to mention that it only occurs if the drive is removed from the
RAID5 set before it is 'synced' for the first time.
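A quick way to see whether that initial sync has completed before removing
a member is to look at /proc/mdstat (a sketch; array name as in this report):
--------------------------------------------------
# Prints the resync progress line for md3 while the initial sync is
# still running; no output (and a non-zero exit status) once it is done.
grep -A 2 '^md3' /proc/mdstat | grep resync
--------------------------------------------------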
Could it be that mddev->curr_resync or mddev->recovery_cp is handled
wrongly in this case? I saw these messages:
..
Jun 23 12:26:10 172 kernel: md: checkpointing recovery of md3.
..
Jun 23 12:26:15 172 kernel: md: resuming recovery of md3 from checkpoint.
..
in the error situation, while there is no state to recover at all at that
point (the array is incomplete/degraded).
Cheers,
Bart
* Re: RAID5 attempt rebuilding despite being incomplete
From: Neil Brown @ 2005-06-24 1:40 UTC
To: bart; +Cc: linux-raid
On Thursday June 23, bart@ardistech.com wrote:
> Hi Neil,
>
> > > I have the problem that my RAID5 array (created with 4 drives) is resyncing
> > > despite the fact that I removed one drive. Any idea what it is doing?
> >
> > Sounds familiar. I thought I had fixed it. What kernel are you
> > running?
> >
> I'm running a 2.6.11 kernel; it should be pretty up to date.
I cannot find the patch I was thinking of to check when it went in,
but I have just tested various failure scenarios on 2.6.12-rc3-mm3 and
it handles them all properly.
If you could try 2.6.12 and confirm, I would appreciate it.
>
> I forgot to mention that it only occurs if the drive is removed from the
> RAID5 set before it is 'synced' for the first time.
>
> Could it be that mddev->curr_resync or mddev->recovery_cp is handled
> wrongly in this case? I saw these messages:
>
> ..
> Jun 23 12:26:10 172 kernel: md: checkpointing recovery of md3.
> ..
> Jun 23 12:26:15 172 kernel: md: resuming recovery of md3 from checkpoint.
> ..
>
> in the error situation, while there is no state to recover at all at that
> point (the array is incomplete/degraded).
Yes, that shouldn't happen in that error situation, and it doesn't for
me.
I noticed that the raid5 was resyncing rather than recovering.
Normally, when you create a raid5 with mdadm, it will recover, as this is
faster than a resync. Did you create the array with '-f' ??
NeilBrown
* Re: RAID5 attempt rebuilding despite being incomplete
From: bart @ 2005-06-24 8:20 UTC
To: Neil Brown; +Cc: linux-raid
Hi Neil,
> I cannot find the patch I was thinking of to check when it went in,
> but I have just tested various failure scenarios on 2.6.12-rc3-mm3 and
> it handles them all properly.
>
> If you could try 2.6.12 and confirm, I would appreciate it.
>
Ok, I will try with 2.6.12.
> I noticed that the raid5 was resyncing rather than recovering.
> Normally when you create a raid5 with mdadm it will recover as this is
> faster than resync. Did you create the array with '-f' ??
>
No, but the '--run' flag was set. I noticed that when omitting the
'--run' flag (as in 'mdadm --create /dev/md3 --level=5 --raid-devices=4
/dev/hd[abef]4'), the array goes into recovery straight away, so
presumably the '--run' flag forces a '-f'.
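For comparison, the two creation modes side by side (a sketch of the
behaviour described above; use one invocation or the other, not both):
--------------------------------------------------
# Default: mdadm builds the array degraded plus one spare, so the
# missing parity is filled in via the (faster) recovery path.
mdadm --create /dev/md3 --level=5 --raid-devices=4 /dev/hd[abef]4

# With '-f'/'--force' (and, apparently, with '--run'): all four members
# start active and the whole array is resynced instead.
mdadm --create /dev/md3 --force --level=5 --raid-devices=4 /dev/hd[abef]4
--------------------------------------------------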
It also looks like the '--detail --test' flag of mdadm (to obtain the array
status as the return value) does not work (it always returns 0). I'll start a
separate thread on that once I've tested it further.
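For reference, the check that appears to misbehave (a sketch):
--------------------------------------------------
# '--test' is meant to encode the array state in the exit status
# (0 = okay, 1 = degraded, ...); here it reportedly returns 0 even
# though md3 is degraded.
mdadm --detail --test /dev/md3
echo "exit status: $?"
--------------------------------------------------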
Cheers,
Bart