linux-raid.vger.kernel.org archive mirror
* RAID5 attempt rebuilding despite being incomplete
@ 2005-06-23 11:31 bart
  2005-06-23 13:13 ` Neil Brown
From: bart @ 2005-06-23 11:31 UTC (permalink / raw)
  To: linux-raid

Hi,

I have the problem that my RAID5 array (created with 4 drives) is resyncing
even though I removed one drive. Any idea what it is doing?

To trigger this (the full command sequence is sketched below):
- Created a RAID5 array from 4 partitions (no spares).
- Marked one drive as failed with 'mdadm --fail /dev/md3 /dev/hdf4'
- Removed that drive with 'mdadm --remove /dev/md3 /dev/hdf4'
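
The whole sequence, put together, is roughly this (device names are just the
ones I used; any four otherwise unused partitions should do):

--------------------------------------------------

# create a 4-disk RAID5 with no spares ('--run' starts it immediately)
mdadm --create /dev/md3 --run --level=5 --raid-devices=4 /dev/hd[abef]4

# fail and remove one member while the initial sync is still running
mdadm --fail /dev/md3 /dev/hdf4
mdadm --remove /dev/md3 /dev/hdf4

--------------------------------------------------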

After this I get:

---------------------------------------------------

# mdadm --detail /dev/md3
/dev/md3:
        Version : 00.90.01
  Creation Time : Thu Jun 23 11:43:59 2005
     Raid Level : raid5
     Array Size : 228151872 (217.58 GiB 233.63 GB)
    Device Size : 76050624 (72.53 GiB 77.88 GB)
   Raid Devices : 4
  Total Devices : 3
Preferred Minor : 3
    Persistence : Superblock is persistent

    Update Time : Thu Jun 23 12:26:19 2005
          State : active, degraded, resyncing
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

 Rebuild Status : 24% complete

           UUID : b9e2b624:b9bef9df:dcdec561:0c3a8dfd
         Events : 0.9

    Number   Major   Minor   RaidDevice State
       0       3        4        0      active sync   /dev/.static/dev/hda4
       1       3       68        1      active sync   /dev/.static/dev/hdb4
       2      33        4        2      active sync   /dev/.static/dev/hde4
       3       0        0        -      removed

--------------------------------------------------

and /proc/mdstat says:

--------------------------------------------------

md3 : active raid5 hde4[2] hdb4[1] hda4[0]
      228151872 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
      [=====>...............]  resync = 25.5% (19442692/76050624) finish=101.0min speed=9332K/sec

--------------------------------------------------

Removing all the drives seems to lead to the array resyncing with 0 drives in
a loop, causing /var/log/messages to grow at a rate of > 100 MByte/sec :(

Isn't it a bug that the array starts resyncing without checking whether the
drives needed for a synced state are present?
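
For now, the only place the problem shows up from user space is the device
count in /proc/mdstat; a crude check (assuming GNU grep's '-o' option) is:

--------------------------------------------------

# print the configured/working device counts for md3, e.g. [4/3]
grep -A1 '^md3 ' /proc/mdstat | grep -o '\[[0-9]*/[0-9]*\]'

--------------------------------------------------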

	Bart


* Re: RAID5 attempt rebuilding despite being incomplete
  2005-06-23 11:31 RAID5 attempt rebuilding despite being incomplete bart
@ 2005-06-23 13:13 ` Neil Brown
  2005-06-23 13:45   ` bart
From: Neil Brown @ 2005-06-23 13:13 UTC (permalink / raw)
  To: bart; +Cc: linux-raid

On Thursday June 23, bart@ardistech.com wrote:
> Hi,
> 
> I have the problem that my RAID5 array (created with 4 drives) is resyncing
> even though I removed one drive. Any idea what it is doing?

Sounds familiar.  I thought I had fixed it.  What kernel are you
running?

NeilBrown


* Re: RAID5 attempt rebuilding despite being incomplete
  2005-06-23 13:13 ` Neil Brown
@ 2005-06-23 13:45   ` bart
  2005-06-24  1:40     ` Neil Brown
From: bart @ 2005-06-23 13:45 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-raid

Hi Neil,

> > I have the problem that my RAID5 array (created with 4 drives) is resyncing
> > even though I removed one drive. Any idea what it is doing?
> 
> Sounds familiar.  I thought I had fixed it.  What kernel are you
> running?
> 
I'm running a 2.6.11 kernel; it should be pretty up to date.

I forgot to mention that it only occurs if the drive is removed from the
RAID5 set before the array has been 'synced' for the first time.

Could it be that mddev->curr_resync or mddev->recovery_cp is handled wrongly
in this case? I saw these messages:

..
Jun 23 12:26:10 172 kernel: md: checkpointing recovery of md3.
..
Jun 23 12:26:15 172 kernel: md: resuming recovery of md3 from checkpoint.
..

in the error situation, while there is no state to recover at all at that
point (the array is incomplete/degraded).
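
If these keep repeating (as with the log growth I mentioned earlier), a quick
way to count them (assuming they end up in /var/log/messages, as they do
here) is:

--------------------------------------------------

grep -c 'md: checkpointing recovery of md3' /var/log/messages

--------------------------------------------------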

Cheers,
	Bart


* Re: RAID5 attempt rebuilding despite being incomplete
  2005-06-23 13:45   ` bart
@ 2005-06-24  1:40     ` Neil Brown
  2005-06-24  8:20       ` bart
From: Neil Brown @ 2005-06-24  1:40 UTC (permalink / raw)
  To: bart; +Cc: linux-raid

On Thursday June 23, bart@ardistech.com wrote:
> Hi Neil,
> 
> > > I have the problem that my RAID5 array (created with 4 drives) is resyncing
> > > even though I removed one drive. Any idea what it is doing?
> > 
> > Sounds familiar.  I thought I had fixed it.  What kernel are you
> > running?
> > 
> I'm running a 2.6.11 kernel; it should be pretty up to date.

I cannot find the patch I was thinking of to check when it went in,
but I have just tested various failure scenarios on 2.6.12-rc3-mm3 and
it handles them all properly.

If you could try 2.6.12 and confirm, I would appreciate it.


> 
> I forgot to mention that it only occurs if the drive is removed from the
> RAID5 set before the array has been 'synced' for the first time.
> 
> Could it be that mddev->curr_resync or mddev->recovery_cp is handled wrongly
> in this case? I saw these messages:
> 
> ..
> Jun 23 12:26:10 172 kernel: md: checkpointing recovery of md3.
> ..
> Jun 23 12:26:15 172 kernel: md: resuming recovery of md3 from checkpoint.
> ..
> 
> in the error situation, while there is no state to recover at all at that
> point (the array is incomplete/degraded).

Yes, that shouldn't happen if there is an error, and it doesn't for
me.


I noticed that the raid5 was resyncing rather than recovering.
Normally when you create a raid5 with mdadm it will recover, as this is
faster than a resync.  Did you create the array with '-f'?

NeilBrown


* Re: RAID5 attempt rebuilding despite being incomplete
  2005-06-24  1:40     ` Neil Brown
@ 2005-06-24  8:20       ` bart
From: bart @ 2005-06-24  8:20 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-raid

Hi Neil,

> I cannot find the patch I was thinking of to check when it went in,
> but I have just tested various failure scenarios on 2.6.12-rc3-mm3 and
> it handles them all properly.
> 
> If you could try 2.6.12 and confirm, I would appreciate it.
> 
OK, I will try 2.6.12.

> I noticed that the raid5 was resyncing rather than recovering.
> Normally when you create a raid5 with mdadm it will recover, as this is
> faster than a resync.  Did you create the array with '-f'?
> 
No, but the '--run' flag was set. I noticed that when omitting the
'--run' flag (as in 'mdadm --create /dev/md3 --level=5 --raid-devices=4
/dev/hd[abef]4') the array goes into recovery straight away, so the
'--run' flag probably forces '-f'.
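
For comparison, the two invocations were roughly (same devices as above):

--------------------------------------------------

# without --run: the array goes straight into recovery
mdadm --create /dev/md3 --level=5 --raid-devices=4 /dev/hd[abef]4

# with --run: the array is started immediately and does a full resync instead
mdadm --create /dev/md3 --run --level=5 --raid-devices=4 /dev/hd[abef]4

--------------------------------------------------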

It also looks like the '--detail --test' option of mdadm (to obtain the
array status as the exit code) does not work (it always returns 0). I'll
start a separate thread on that once I have tested it further.
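
The way I would expect '--detail --test' to behave in a script is something
like the following (the exit codes are the ones documented in the mdadm man
page; whether this particular mdadm version implements them is what I still
need to check):

--------------------------------------------------

mdadm --detail --test /dev/md3
case $? in
    0) echo "md3 is running normally" ;;
    1) echo "md3 is degraded (at least one failed or missing device)" ;;
    2) echo "md3 has too many failed devices to be usable" ;;
    *) echo "error getting information about md3" ;;
esac

--------------------------------------------------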

Cheers,
	Bart

