raid5: degraded after reboot

* raid5: degraded after reboot
@ 2007-10-12 15:38 Jon Nelson
  2007-10-12 15:47 ` Andre Noll
  0 siblings, 1 reply; 7+ messages in thread
From: Jon Nelson @ 2007-10-12 15:38 UTC (permalink / raw)
  To: linux-raid

I have a software raid5 using /dev/sd{a,b,c}4.
It's been up for months, through many reboots.

I had to reboot using sysrq.

When the box came back up, the raid did not re-assemble.
I am not using bitmaps.

I believe it comes down to this:

<4>md: kicking non-fresh sda4 from array!

what does that mean?

I also have this:

raid5: raid level 5 set md0 active with 2 out of 3 devices, algorithm 2
RAID5 conf printout:
 --- rd:3 wd:2 fd:1
 disk 1, o:1, dev:sdb4
 disk 2, o:1, dev:sdc4
mdadm: forcing event count in /dev/sdb4(1) from 327615 upto 327626

Why was /dev/sda4 kicked?

Contents of /etc/mdadm.conf:

DEVICE /dev/hd*[a-h][0-9] /dev/sd*[a-h][0-9]
ARRAY /dev/md0 level=raid5 num-devices=3 UUID=b4597c3f:ab953cb9:32634717:ca110bfc

Current /proc/mdstat:

md0 : active raid5 sda4[3] sdb4[1] sdc4[2]
      613409664 blocks level 5, 64k chunk, algorithm 2 [3/2] [_UU]
      [==>..................]  recovery = 13.1% (40423368/306704832) finish=68.8min speed=64463K/sec

65-70MB/s is about what these drives can do so the rebuild speed is just peachy.
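
As a rough cross-check of the /proc/mdstat figures above, the sketch
below recomputes the recovery fraction and the remaining time from the
numbers in the status line. It assumes the two counters are 1K blocks
and that the speed is in K/sec, as the units in the output suggest; the
function name is made up for illustration.

```python
def recovery_estimate(done_blocks, total_blocks, speed_k_per_sec):
    """Return (fraction_done, minutes_remaining) for an md recovery."""
    remaining = total_blocks - done_blocks
    fraction = done_blocks / total_blocks
    minutes = remaining / speed_k_per_sec / 60
    return fraction, minutes


# Numbers copied from the mdstat output above.
fraction, minutes = recovery_estimate(40423368, 306704832, 64463)
print(f"{fraction:.1%} done, about {minutes:.1f} minutes remaining")

# A 3-device RAID5 exposes (3 - 1) times the per-device size.
assert (3 - 1) * 306704832 == 613409664
```

Note that the speed is reported in K/sec, so 64463K/sec is in the
region of 64 MB/s.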

-- 
Jon

* Re: raid5: degraded after reboot
  2007-10-12 15:38 raid5: degraded after reboot Jon Nelson
@ 2007-10-12 15:47 ` Andre Noll
  2007-10-12 16:08   ` Jon Nelson
  2007-10-12 18:58   ` Bill Davidsen
  0 siblings, 2 replies; 7+ messages in thread
From: Andre Noll @ 2007-10-12 15:47 UTC (permalink / raw)
  To: Jon Nelson; +Cc: linux-raid

On 10:38, Jon Nelson wrote:
> <4>md: kicking non-fresh sda4 from array!
> 
> what does that mean?

sda4 was not included because the array has been assembled previously
using only sdb4 and sdc4. So the data on sda4 is out of date.

> I also have this:
> 
> raid5: raid level 5 set md0 active with 2 out of 3 devices, algorithm 2
> RAID5 conf printout:
>  --- rd:3 wd:2 fd:1
>  disk 1, o:1, dev:sdb4
>  disk 2, o:1, dev:sdc4

This looks normal. The array is up with two working disks.
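
To make the "rd:3 wd:2 fd:1" line concrete, here is a small sketch that
parses it and classifies the array. It assumes rd, wd and fd stand for
raid disks, working disks and failed disks, which is my reading of the
printout rather than something stated in this thread; the function
names are hypothetical.

```python
import re


def parse_conf_line(line):
    """Extract the rd/wd/fd counters from a 'RAID5 conf printout' line."""
    m = re.search(r"rd:(\d+)\s+wd:(\d+)\s+fd:(\d+)", line)
    if m is None:
        raise ValueError(f"no rd/wd/fd counters in: {line!r}")
    rd, wd, fd = map(int, m.groups())
    return {"raid_disks": rd, "working_disks": wd, "failed_disks": fd}


def raid5_state(conf):
    """RAID5 keeps running with exactly one member missing, not more."""
    missing = conf["raid_disks"] - conf["working_disks"]
    if missing == 0:
        return "clean"
    if missing == 1:
        return "degraded"
    return "failed"


conf = parse_conf_line(" --- rd:3 wd:2 fd:1")
print(conf, raid5_state(conf))
```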

> Why was /dev/sda4 kicked?

Because it was non-fresh ;)

> md0 : active raid5 sda4[3] sdb4[1] sdc4[2]
>       613409664 blocks level 5, 64k chunk, algorithm 2 [3/2] [_UU]
>       [==>..................]  recovery = 13.1% (40423368/306704832)
> finish=68.8min speed=64463K/sec

Seems like your init scripts re-added sda4.

> 65-70MB/s is about what these drives can do so the rebuild speed is just peachy.

If the rebuild completes successfully, you're ok again.  There's
nothing you have to do.

Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe


* Re: raid5: degraded after reboot
  2007-10-12 15:47 ` Andre Noll
@ 2007-10-12 16:08   ` Jon Nelson
  2007-10-12 16:18     ` Andre Noll
  2007-10-12 18:58   ` Bill Davidsen
  1 sibling, 1 reply; 7+ messages in thread
From: Jon Nelson @ 2007-10-12 16:08 UTC (permalink / raw)
  Cc: linux-raid

On 10/12/07, Andre Noll <maan@systemlinux.org> wrote:
> On 10:38, Jon Nelson wrote:
> > <4>md: kicking non-fresh sda4 from array!
> >
> > what does that mean?
>
> sda4 was not included because the array has been assembled previously
> using only sdb4 and sdc4. So the data on sda4 is out of date.

I don't understand - over months and months it has always been the three
devices, /dev/sd{a,b,c}4.
I've added and removed bitmaps and done other things but at the time it
rebooted the array had been up, "clean" (non-degraded), and comprised of the
three devices for 4-6 weeks.

> > I also have this:
> >
> > raid5: raid level 5 set md0 active with 2 out of 3 devices, algorithm 2
> > RAID5 conf printout:
> >  --- rd:3 wd:2 fd:1
> >  disk 1, o:1, dev:sdb4
> >  disk 2, o:1, dev:sdc4
>
> This looks normal. The array is up with two working disks.

Two of three which, to me, is "abnormal" (ie, the "normal" state is three
and it's got two).

> > Why was /dev/sda4 kicked?
>
> Because it was non-fresh ;)

OK, but what does that MEAN?


> > md0 : active raid5 sda4[3] sdb4[1] sdc4[2]
> >       613409664 blocks level 5, 64k chunk, algorithm 2 [3/2] [_UU]
> >       [==>..................]  recovery = 13.1% (40423368/306704832)
> > finish=68.8min speed=64463K/sec
>
> Seems like your init scripts re-added sda4.

No, I did this by hand. I forgot to say that.

--
Jon

* Re: raid5: degraded after reboot
  2007-10-12 16:08   ` Jon Nelson
@ 2007-10-12 16:18     ` Andre Noll
  2007-10-12 17:05       ` Jon Nelson
  0 siblings, 1 reply; 7+ messages in thread
From: Andre Noll @ 2007-10-12 16:18 UTC (permalink / raw)
  To: Jon Nelson; +Cc: linux-raid

On 11:08, Jon Nelson wrote:

> > sda4 was not included because the array has been assembled previously
> > using only sdb4 and sdc4. So the data on sda4 is out of date.
> 
> I don't understand - over months and months it has always been the three
> devices, /dev/sd{a,b,c}4.
> I've added and removed bitmaps and done other things but at the time it
> rebooted the array had been up, "clean" (non-degraded), and comprised of the
> three devices for 4-6 weeks.

You said you had to reboot your box using sysrq. There is a chance the
reboot happened after all pending data had been written to sdb4 and
sdc4, but not yet to sda4. So sda4 appears to be non-fresh after the
reboot and, since mdadm refuses to use non-fresh devices, it kicks sda4.

> > This looks normal. The array is up with two working disks.
> 
> Two of three which, to me, is "abnormal" (ie, the "normal" state is three
> and it's got two).

Sure. I should have said: It's normal if one disk in a raid5 array is
missing (or non-fresh).

> > > Why was /dev/sda4 kicked?
> >
> > Because it was non-fresh ;)
> 
> OK, but what does that MEAN?

To be precise, it means that the event counter for sda4 is less than
the event counter on the other devices in the array. So mdadm must
assume the data on sda4 is out of sync and hence the device can't be
used. If you are not using bitmaps, there is no other way out than
syncing the whole device, i.e. writing good data (computed from sdb4
and sdc4) to sda4.
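
For illustration, "computed from sdb4 and sdc4" boils down to XOR in a
three-device RAID5: within one stripe, any chunk is the XOR of the
other two. The sketch below is a toy with made-up chunk contents, not
the md implementation, which works on whole stripes with parity
rotating across the devices.

```python
def xor_blocks(*blocks):
    """XOR equally sized byte strings together."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("blocks must have the same length")
    out = bytearray(size)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


# Made-up contents standing in for one stripe.
original_a = b"chunk on sda4"
chunk_b = b"chunk on sdb4"
# Pretend sdc4 holds the parity chunk for this stripe.
chunk_c = xor_blocks(original_a, chunk_b)

# sda4 is stale: rebuild its chunk from the two good devices.
rebuilt_a = xor_blocks(chunk_b, chunk_c)
assert rebuilt_a == original_a
```

The same XOR works whether the stale device held data or parity for a
given stripe, which is why a resync can rewrite it completely.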

Hope that helps.
Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe


* Re: raid5: degraded after reboot
  2007-10-12 16:18     ` Andre Noll
@ 2007-10-12 17:05       ` Jon Nelson
  2007-10-12 18:32         ` Andre Noll
  0 siblings, 1 reply; 7+ messages in thread
From: Jon Nelson @ 2007-10-12 17:05 UTC (permalink / raw)
  To: Andre Noll; +Cc: linux-raid

> You said you had to reboot your box using sysrq. There are chances you
> caused the reboot while all pending data was written to sdb4 and sdc4,
> but not to sda4. So sda4 appears to be non-fresh after the reboot and,
> since mdadm refuses to use non-fresh devices, it kicks sda4.

Can mdadm be told to use non-fresh devices?
What about sdb4: I can understand rewinding an event count (sorta),
but what does this mean:

mdadm: forcing event count in /dev/sdb4(1) from 327615 upto 327626

Since the array is degraded, there are 11 "events" missing from sdb4
(presumably sdc4 had them). Since sda4 is not part of the array, the
events can't be complete, can they?  Why jump *ahead* on events
instead of rewinding?


> Sure. I should have said: It's normal if one disk in a raid5 array is
> missing (or non-fresh).

I do not have a spare for this raid - I am aware of the risks and
mitigate them in other ways.

> To be precise, it means that the event counter for sda4 is less than
> the event counter on the other devices in the array. So mdadm must
> assume the data on sda4 is out of sync and hence the device can't be
> used. If you are not using bitmaps, there is no other way out than
> syncing the whole device, i.e. writing good data (computed from sdb4
> and sdc4) to sda4.
>
> Hope that helps.

Yes, that helps.

-- 
Jon

* Re: raid5: degraded after reboot
  2007-10-12 17:05       ` Jon Nelson
@ 2007-10-12 18:32         ` Andre Noll
  0 siblings, 0 replies; 7+ messages in thread
From: Andre Noll @ 2007-10-12 18:32 UTC (permalink / raw)
  To: Jon Nelson; +Cc: linux-raid

On 12:05, Jon Nelson wrote:

> Can mdadm be told to use non-fresh devices?

Sure. --force does the trick.

> What about sdb4: I can understand rewinding an event count (sorta),
> but what does this mean:
> 
> mdadm: forcing event count in /dev/sdb4(1) from 327615 upto 327626

Well, it means the event counter was advanced forward by 11 events. I'm
not sure under which circumstances this message is printed though.
Clearly, after a successful resync the event counter of the added disk
is adjusted to match the value of the rest of the array. AFAIK,
assembling an array using --force would also cause such an adjustment.
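
As a sketch of what such a forced assembly looks like, the snippet
below only builds and prints the mdadm command lines; it does not run
them. The device names are the ones from this thread, the helper
functions are hypothetical, and forcing a stale member into an array
can lose data, so treat this as an illustration only.

```python
import shlex


def mdadm_examine(member):
    """Command to print a member's superblock, including its event count."""
    return ["mdadm", "--examine", member]


def mdadm_force_assemble(array, members):
    """Command to assemble an array even if event counts disagree."""
    return ["mdadm", "--assemble", "--force", array, *members]


members = ["/dev/sda4", "/dev/sdb4", "/dev/sdc4"]
for member in members:
    print(shlex.join(mdadm_examine(member)))

cmd = mdadm_force_assemble("/dev/md0", members)
print(shlex.join(cmd))
```

Looking at the --examine output first shows how far apart the event
counts are before deciding whether forcing is reasonable.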

> Since the array is degraded, there are 11 "events" missing from sdb4
> (presumably sdc4 had them). Since sda4 is not part of the array, the
> events can't be complete, can they?

There's no such thing as a "complete event". An event happens, for
example, when the array gets assembled, or if the superblock changes
due to the user adding bitmap support. The event counter is simply
a number which is written to each device of a raid array and which
is increased whenever an event occurs.  Note that changes to the
underlying file system do not cause events, so the data on the disk
may change completely although the event counter stays the same.

Normally all devices contain the same count. But if you, for example,
yank out a drive and assemble the array without that drive, the number
on that drive isn't increased, obviously. If you plug the drive in
again later and try to assemble the array, that drive has a lower
event count than all other drives, i.e. it's non-fresh. Since any
number of changes might have happened during the time the array was
degraded, the data on the non-fresh drive cannot be trusted, which
means it must not be used when assembling the array. So the drive is
kicked and a full sync is necessary.
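
The yanked-drive scenario can be sketched as a toy model. This is not
how md or mdadm is implemented (the real comparison has more cases,
bitmaps among them); the class, its methods and the starting count of
100 are all made up for illustration.

```python
class ToyArray:
    """Per-device event counters, bumped only on active members."""

    def __init__(self, names, events=0):
        self.events = {name: events for name in names}
        self.active = set(names)

    def bump(self):
        """Record one superblock event on every active member."""
        for name in self.active:
            self.events[name] += 1

    def pull(self, name):
        """Remove a member; the remaining members record the change."""
        self.active.discard(name)
        self.bump()

    def non_fresh(self):
        """Members whose counter is behind the newest one."""
        newest = max(self.events.values())
        return sorted(n for n, e in self.events.items() if e < newest)


arr = ToyArray(["sda4", "sdb4", "sdc4"], events=100)
arr.pull("sda4")   # drive yanked, array keeps running degraded
arr.bump()         # e.g. the degraded array is assembled again
print(arr.events, arr.non_fresh())
```

File system writes never call bump() in this model, which mirrors the
point above: equal counters say nothing about how much data changed.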

>   Why jump *ahead* on events
> instead of rewinding?

That wouldn't buy you anything. Not a good idea.

Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe


* Re: raid5: degraded after reboot
  2007-10-12 15:47 ` Andre Noll
  2007-10-12 16:08   ` Jon Nelson
@ 2007-10-12 18:58   ` Bill Davidsen
  1 sibling, 0 replies; 7+ messages in thread
From: Bill Davidsen @ 2007-10-12 18:58 UTC (permalink / raw)
  To: Andre Noll; +Cc: Jon Nelson, linux-raid

Andre Noll wrote:
> On 10:38, Jon Nelson wrote:
>   
>> <4>md: kicking non-fresh sda4 from array!
>>
>> what does that mean?
>>     
>
> sda4 was not included because the array has been assembled previously
> using only sdb4 and sdc4. So the data on sda4 is out of date.
>
>   
>> I also have this:
>>
>> raid5: raid level 5 set md0 active with 2 out of 3 devices, algorithm 2
>> RAID5 conf printout:
>>  --- rd:3 wd:2 fd:1
>>  disk 1, o:1, dev:sdb4
>>  disk 2, o:1, dev:sdc4
>>     
>
> This looks normal. The array is up with two working disks.
>
>   
>> Why was /dev/sda4 kicked?
>>     
>
> Because it was non-fresh ;)
>
>   
>> md0 : active raid5 sda4[3] sdb4[1] sdc4[2]
>>       613409664 blocks level 5, 64k chunk, algorithm 2 [3/2] [_UU]
>>       [==>..................]  recovery = 13.1% (40423368/306704832)
>> finish=68.8min speed=64463K/sec
>>     
>
> Seems like your init scripts re-added sda4.
>
>   
>> 65-70MB/s is about what these drives can do so the rebuild speed is just peachy.
>>     
>
> If the rebuild completes successfully, you're ok again.  There's
> nothing you have to do.
>   

What you didn't say is that "doing nothing" is not only all that's 
required, but where possible it's the best thing *to* do, avoiding any 
testing of the "recover while active" logic, any extra seeking, etc.

And I would be sure I understood why the system had to be force booted. 
If sysrq is working I would assume using 's' before 'b' is normal. 'Tis 
on my systems!
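
For what it's worth, the 's' before 'b' order can be sketched as below,
using /proc/sysrq-trigger instead of the keyboard. This is a
hypothetical helper, not a recommended tool: it defaults to a dry run,
needs root and an enabled sysrq to do anything real, and the pause
length is an arbitrary guess at giving the sync time to finish.

```python
import time


def sysrq_reboot(keys="sb", trigger="/proc/sysrq-trigger",
                 dry_run=True, pause=2.0):
    """Send SysRq keys in order: 's' syncs disks, 'b' reboots at once."""
    sent = []
    for key in keys:
        if not dry_run:
            with open(trigger, "w") as f:
                f.write(key)
            # The sync is only kicked off by 's'; wait before the next key.
            time.sleep(pause)
        sent.append(key)
    return sent


# Dry run: only reports the order the keys would be sent in.
print(sysrq_reboot())
```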


-- 
bill davidsen <davidsen@tmr.com>
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979

