* RAID5 kicks non-fresh drives
@ 2006-05-25 15:38 Craig Hollabaugh
2006-05-25 21:18 ` Neil Brown
0 siblings, 1 reply; 17+ messages in thread
From: Craig Hollabaugh @ 2006-05-25 15:38 UTC (permalink / raw)
To: linux-raid
Folks,
I had two drives fail on a 13-drive RAID5 array with bad blocks
(confirmed with an external disk scan). I replaced them and hot-added the
new drives back into the array. Resync completed without incident. I moved
the machine back into production and after reboot, the two new drives get
kicked out of the array for being non-fresh. Everything I try results in
these two drives always getting kicked out.
Here's what I tried.
- searched and read for at least 10 hours for info on kicking "non-fresh"
- hot-adding then rebooting 5 times with same result
- using kernel 2.4.30, 2.6.11.8 and 2.6.16.8
  (resync takes 4 hours to complete, so iterations take a while)
- mdadm version is v1.12
- after the resync before the reboot, manual stopping and starting the
  array: always in correct operation (no kicking of drives)
My questions are
1. How does a drive become non-fresh?
2. Is the non-fresh status related to 'events'?
3. How can I determine that all the drives are fresh before a reboot?
4. 2.4.30 and 2.6.11.8 dmesg output mentions kicking non-fresh drives.
   2.6.16.8 doesn't even consider my new drives, see "after reboot" below.
   After a resync, how can I determine that all my drives are actually
   part of the array?
   mdadm -E /dev/sdX1 for each drive shows the same info.
5. From everything I've tried, the array looks fine before the reboot.
   But no matter what I've tried, the drives are kicked upon reboot.
6. /proc/mdstat reports "Personalities : [raid5] [raid4]". The array is
   raid5, so where does raid4 come from?
Thanks for reading this and any suggestions you can offer.
Craig
--
------------------------------------------------------------
Dr. Craig Hollabaugh, craig@hollabaugh.com, 970 240 0509
Author of Embedded Linux: Hardware, Software and Interfacing
www.embeddedlinuxinterfacing.com
The two drives in question are sdj1 and sdk1.
Here's output after the resync before the reboot
root@vaughan[502]: cat /proc/mdstat
Personalities : [raid5] [raid4]
md0 : active raid5 sdj1[12](S) sdk1[9] sda1[0] sdl1[11] hdc1[10] sdd1[8]
sdh1[7] sdg1[6] sdf1[5] sde1[4] sdi1[3] sdc1[2] sdb1[1]
1289056384 blocks level 5, 128k chunk, algorithm 2 [12/12]
[UUUUUUUUUUUU]
unused devices: <none>
root@vaughan[501]: mdadm -D /dev/md0
/dev/md0:
Version : 00.90.03
Creation Time : Thu Jan 16 09:10:52 2003
Raid Level : raid5
Array Size : 1289056384 (1229.34 GiB 1319.99 GB)
Device Size : 117186944 (111.76 GiB 120.00 GB)
Raid Devices : 12
Total Devices : 13
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Thu May 25 05:36:58 2006
State : clean
Active Devices : 12
Working Devices : 13
Failed Devices : 0
Spare Devices : 1
Layout : left-symmetric
Chunk Size : 128K
UUID : 4d862825:91140f1a:eb97e7f2:9bfa2403
Events : 0.2681049
Number Major Minor RaidDevice State
0 8 1 0 active sync /dev/sda1
1 8 17 1 active sync /dev/sdb1
2 8 33 2 active sync /dev/sdc1
3 8 129 3 active sync /dev/sdi1
4 8 65 4 active sync /dev/sde1
5 8 81 5 active sync /dev/sdf1
6 8 97 6 active sync /dev/sdg1
7 8 113 7 active sync /dev/sdh1
8 8 49 8 active sync /dev/sdd1
9 8 161 9 active sync /dev/sdk1
10 22 1 10 active sync /dev/hdc1
11 8 177 11 active sync /dev/sdl1
12 8 145 - spare /dev/sdj1
root@vaughan[512]: mdadm -E /dev/sdj1
/dev/sdj1:
Magic : a92b4efc
Version : 00.90.00
UUID : 4d862825:91140f1a:eb97e7f2:9bfa2403
Creation Time : Thu Jan 16 09:10:52 2003
Raid Level : raid5
Raid Devices : 12
Total Devices : 13
Preferred Minor : 0
Update Time : Thu May 25 05:36:58 2006
State : clean
Active Devices : 12
Working Devices : 13
Failed Devices : 0
Spare Devices : 1
Checksum : 9943fc98 - correct
Events : 0.2681049
Layout : left-symmetric
Chunk Size : 128K
Number Major Minor RaidDevice State
this 12 8 145 12 spare /dev/sdj1
0 0 8 1 0 active sync /dev/sda1
1 1 8 17 1 active sync /dev/sdb1
2 2 8 33 2 active sync /dev/sdc1
3 3 8 129 3 active sync /dev/sdi1
4 4 8 65 4 active sync /dev/sde1
5 5 8 81 5 active sync /dev/sdf1
6 6 8 97 6 active sync /dev/sdg1
7 7 8 113 7 active sync /dev/sdh1
8 8 8 49 8 active sync /dev/sdd1
9 9 8 161 9 active sync /dev/sdk1
10 10 22 1 10 active sync /dev/hdc1
11 11 8 177 11 active sync /dev/sdl1
12 12 8 145 12 spare /dev/sdj1
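On questions 1 to 3 above: md generally judges freshness by the event counter stored in each member's superblock, and a member whose counter lags the highest one is the kind of device that gets reported as non-fresh at assembly. A minimal Python sketch of that comparison, working on saved `mdadm -E` text, might look like the following. The function names `event_count` and `stale_members` are illustrative and not part of mdadm, and reading the `0.2681049` form as high.low 32-bit halves is an assumption about how 0.90 superblocks are printed.

```python
import re


def event_count(examine_text):
    """Return the event counter from `mdadm -E` output as an int, or None."""
    m = re.search(r"^\s*Events\s*:\s*(\d+)(?:\.(\d+))?\s*$",
                  examine_text, re.MULTILINE)
    if m is None:
        return None
    if m.group(2) is None:
        # Newer output style: a single decimal number.
        return int(m.group(1))
    # Assumed 0.90 style: "high.low", two 32-bit halves of one 64-bit counter.
    return (int(m.group(1)) << 32) | int(m.group(2))


def stale_members(examine_by_device):
    """Return the devices whose event counter is below the highest one seen."""
    counts = {dev: event_count(text) for dev, text in examine_by_device.items()}
    known = [c for c in counts.values() if c is not None]
    if not known:
        return []
    newest = max(known)
    return sorted(dev for dev, c in counts.items()
                  if c is not None and c < newest)
```

In the output above every member reports 0.2681049, so this check would come back empty. Equal counters do not by themselves guarantee that the kernel will autodetect a member at boot.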
------------------------------------------------------------------------------------------------
Now after reboot
root@vaughan[542]: uname -a
Linux vaughan 2.6.16.8 #1 Wed May 24 15:00:27 MDT 2006 i686 GNU/Linux
From dmesg
md: Autodetecting RAID arrays.
md: autorun ...
md: considering sdl1 ...
md: adding sdl1 ...
md: adding sdi1 ...
md: adding sdh1 ...
md: adding sdg1 ...
md: adding sdf1 ...
md: adding sde1 ...
md: adding sdd1 ...
md: adding sdc1 ...
md: adding sdb1 ...
md: adding sda1 ...
md: adding hdc1 ...
md: created md0
The kernel didn't add sdj or sdk.
root@vaughan[501]: cat /proc/mdstat
Personalities : [raid5] [raid4]
md0 : active raid5 sdl1[11] sdi1[3] sdh1[7] sdg1[6] sdf1[5] sde1[4]
sdd1[8] sdc1[2] sdb1[1] sda1[0] hdc1[10]
1289056384 blocks level 5, 128k chunk, algorithm 2 [12/11]
[UUUUUUUUU_UU]
unused devices: <none>
root@vaughan[502]: mdadm -D /dev/md0
/dev/md0:
Version : 00.90.03
Creation Time : Thu Jan 16 09:10:52 2003
Raid Level : raid5
Array Size : 1289056384 (1229.34 GiB 1319.99 GB)
Device Size : 117186944 (111.76 GiB 120.00 GB)
Raid Devices : 12
Total Devices : 11
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Thu May 25 05:36:58 2006
State : clean, degraded
Active Devices : 11
Working Devices : 11
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 128K
UUID : 4d862825:91140f1a:eb97e7f2:9bfa2403
Events : 0.2681049
Number Major Minor RaidDevice State
0 8 1 0 active sync /dev/sda1
1 8 17 1 active sync /dev/sdb1
2 8 33 2 active sync /dev/sdc1
3 8 129 3 active sync /dev/sdi1
4 8 65 4 active sync /dev/sde1
5 8 81 5 active sync /dev/sdf1
6 8 97 6 active sync /dev/sdg1
7 8 113 7 active sync /dev/sdh1
8 8 49 8 active sync /dev/sdd1
9 0 0 - removed
10 22 1 10 active sync /dev/hdc1
11 8 177 11 active sync /dev/sdl1
root@vaughan[512]: mdadm -E /dev/sdj1
/dev/sdj1:
Magic : a92b4efc
Version : 00.90.00
UUID : 4d862825:91140f1a:eb97e7f2:9bfa2403
Creation Time : Thu Jan 16 09:10:52 2003
Raid Level : raid5
Raid Devices : 12
Total Devices : 13
Preferred Minor : 0
Update Time : Thu May 25 05:36:58 2006
State : clean
Active Devices : 12
Working Devices : 13
Failed Devices : 0
Spare Devices : 1
Checksum : 9943fc98 - correct
Events : 0.2681049
Layout : left-symmetric
Chunk Size : 128K
Number Major Minor RaidDevice State
this 12 8 145 12 spare /dev/sdj1
0 0 8 1 0 active sync /dev/sda1
1 1 8 17 1 active sync /dev/sdb1
2 2 8 33 2 active sync /dev/sdc1
3 3 8 129 3 active sync /dev/sdi1
4 4 8 65 4 active sync /dev/sde1
5 5 8 81 5 active sync /dev/sdf1
6 6 8 97 6 active sync /dev/sdg1
7 7 8 113 7 active sync /dev/sdh1
8 8 8 49 8 active sync /dev/sdd1
9 9 8 161 9 active sync /dev/sdk1
10 10 22 1 10 active sync /dev/hdc1
11 11 8 177 11 active sync /dev/sdl1
12 12 8 145 12 spare /dev/sdj1
* Re: RAID5 kicks non-fresh drives
2006-05-25 15:38 RAID5 kicks non-fresh drives Craig Hollabaugh
@ 2006-05-25 21:18 ` Neil Brown
2006-05-25 21:39 ` Craig Hollabaugh
2006-05-25 22:30 ` Craig Hollabaugh
0 siblings, 2 replies; 17+ messages in thread
From: Neil Brown @ 2006-05-25 21:18 UTC (permalink / raw)
To: Craig Hollabaugh; +Cc: linux-raid
On Thursday May 25, craig@hollabaugh.com wrote:
>
> From dmesg
> md: Autodetecting RAID arrays.
> md: autorun ...
> md: considering sdl1 ...
> md: adding sdl1 ...
> md: adding sdi1 ...
> md: adding sdh1 ...
> md: adding sdg1 ...
> md: adding sdf1 ...
> md: adding sde1 ...
> md: adding sdd1 ...
> md: adding sdc1 ...
> md: adding sdb1 ...
> md: adding sda1 ...
> md: adding hdc1 ...
> md: created md0
>
> The kernel didn't add sdj or sdk.
>
And the partition types of sdj1 and sdk1 are ???
NeilBrown
* Re: RAID5 kicks non-fresh drives
2006-05-25 21:18 ` Neil Brown
@ 2006-05-25 21:39 ` Craig Hollabaugh
2006-05-25 22:30 ` Craig Hollabaugh
1 sibling, 0 replies; 17+ messages in thread
From: Craig Hollabaugh @ 2006-05-25 21:39 UTC (permalink / raw)
To: Neil Brown; +Cc: linux-raid
Neil,
sdj and sdk are FS Type 'Linux'.
all the other partitions are FS Type 'Linux raid autodetect'
I don't remember ever having to set the partition type. It's not in my
build notes for this machine or another 13 drive server.
Should I change partition type to 'Linux raid autodetect'?
If so, how can I verify array configuration prior to rebooting?
Thanks for the reply Neil. I never checked that.
Craig
ps. My new drives are certainly getting a workout through this learning
process.
On Fri, 2006-05-26 at 07:18 +1000, Neil Brown wrote:
> On Thursday May 25, craig@hollabaugh.com wrote:
> >
> > From dmesg
> > md: Autodetecting RAID arrays.
> > md: autorun ...
> > md: considering sdl1 ...
> > md: adding sdl1 ...
> > md: adding sdi1 ...
> > md: adding sdh1 ...
> > md: adding sdg1 ...
> > md: adding sdf1 ...
> > md: adding sde1 ...
> > md: adding sdd1 ...
> > md: adding sdc1 ...
> > md: adding sdb1 ...
> > md: adding sda1 ...
> > md: adding hdc1 ...
> > md: created md0
> >
> > The kernel didn't add sdj or sdk.
> >
>
> And the partition types of sdj1 and sdk1 are ???
>
> NeilBrown
>
--
------------------------------------------------------------
Dr. Craig Hollabaugh, craig@hollabaugh.com, 970 240 0509
Author of Embedded Linux: Hardware, Software and Interfacing
www.embeddedlinuxinterfacing.com
* Re: RAID5 kicks non-fresh drives
2006-05-25 21:18 ` Neil Brown
2006-05-25 21:39 ` Craig Hollabaugh
@ 2006-05-25 22:30 ` Craig Hollabaugh
2006-05-26 7:57 ` Mikael Abrahamsson
1 sibling, 1 reply; 17+ messages in thread
From: Craig Hollabaugh @ 2006-05-25 22:30 UTC (permalink / raw)
To: Neil Brown; +Cc: linux-raid
On Fri, 2006-05-26 at 07:18 +1000, Neil Brown wrote:
> And the partition types of sdj1 and sdk1 are ???
Neil,
That did it! I set the partition FS Types from 'Linux' to 'Linux raid
autodetect' after my last re-sync completed. Manually stopped and
started the array. Things looked good, so I crossed my fingers and
rebooted. The kernel found all the drives and all is happy here in
Colorado.
Thanks ever so much for your comment!!
Craig
After the reboot
root@vaughan[501]: mdadm -D /dev/md0
/dev/md0:
Version : 00.90.03
Creation Time : Thu Jan 16 09:10:52 2003
Raid Level : raid5
Array Size : 1289056384 (1229.34 GiB 1319.99 GB)
Device Size : 117186944 (111.76 GiB 120.00 GB)
Raid Devices : 12
Total Devices : 13
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Thu May 25 16:21:28 2006
State : clean
Active Devices : 12
Working Devices : 13
Failed Devices : 0
Spare Devices : 1
Layout : left-symmetric
Chunk Size : 128K
UUID : 4d862825:91140f1a:eb97e7f2:9bfa2403
Events : 0.2684360
Number Major Minor RaidDevice State
0 8 1 0 active sync /dev/sda1
1 8 17 1 active sync /dev/sdb1
2 8 33 2 active sync /dev/sdc1
3 8 129 3 active sync /dev/sdi1
4 8 65 4 active sync /dev/sde1
5 8 81 5 active sync /dev/sdf1
6 8 97 6 active sync /dev/sdg1
7 8 113 7 active sync /dev/sdh1
8 8 49 8 active sync /dev/sdd1
9 8 161 9 active sync /dev/sdk1
10 22 1 10 active sync /dev/hdc1
11 8 177 11 active sync /dev/sdl1
12 8 145 - spare /dev/sdj1
* Re: RAID5 kicks non-fresh drives
2006-05-25 22:30 ` Craig Hollabaugh
@ 2006-05-26 7:57 ` Mikael Abrahamsson
2006-05-26 14:11 ` Craig Hollabaugh
` (2 more replies)
0 siblings, 3 replies; 17+ messages in thread
From: Mikael Abrahamsson @ 2006-05-26 7:57 UTC (permalink / raw)
To: Craig Hollabaugh; +Cc: Neil Brown, linux-raid
On Thu, 25 May 2006, Craig Hollabaugh wrote:
> That did it! I set the partition FS Types from 'Linux' to 'Linux raid
> autodetect' after my last re-sync completed. Manually stopped and
> started the array. Things looked good, so I crossed my fingers and
> rebooted. The kernel found all the drives and all is happy here in
> Colorado.
Would it make sense for the raid code to somehow warn in the log when a
device in a raid set doesn't have "Linux raid autodetect" partition type?
If this was in "dmesg", would you have spotted the problem before?
--
Mikael Abrahamsson email: swmike@swm.pp.se
* Re: RAID5 kicks non-fresh drives
2006-05-26 7:57 ` Mikael Abrahamsson
@ 2006-05-26 14:11 ` Craig Hollabaugh
2006-05-26 16:45 ` Mark Hahn
2006-05-26 17:32 ` Bill Davidsen
2006-05-29 5:20 ` Neil Brown
2 siblings, 1 reply; 17+ messages in thread
From: Craig Hollabaugh @ 2006-05-26 14:11 UTC (permalink / raw)
To: Mikael Abrahamsson; +Cc: Neil Brown, linux-raid
I had no idea about this particular configuration requirement. None of
my reading mentioned setting the partition type. I originally created
the array 1/2003 and don't remember having to set it. So, yes, more
debugging info in dmesg would have saved me days of
resync/tweak/reboot/resync cycles. (I'm not complaining, just very
relieved to be up and running again.)
On Fri, 2006-05-26 at 09:57 +0200, Mikael Abrahamsson wrote:
> On Thu, 25 May 2006, Craig Hollabaugh wrote:
>
> > That did it! I set the partition FS Types from 'Linux' to 'Linux raid
> > autodetect' after my last re-sync completed. Manually stopped and
> > started the array. Things looked good, so I crossed my fingers and
> > rebooted. The kernel found all the drives and all is happy here in
> > Colorado.
>
> Would it make sense for the raid code to somehow warn in the log when a
> device in a raid set doesn't have "Linux raid autodetect" partition type?
> If this was in "dmesg", would you have spotted the problem before?
>
--
------------------------------------------------------------
Dr. Craig Hollabaugh, craig@hollabaugh.com, 970 240 0509
Author of Embedded Linux: Hardware, Software and Interfacing
www.embeddedlinuxinterfacing.com
* Re: RAID5 kicks non-fresh drives
2006-05-26 14:11 ` Craig Hollabaugh
@ 2006-05-26 16:45 ` Mark Hahn
2006-05-26 17:06 ` Craig Hollabaugh
2006-05-29 4:34 ` Neil Brown
0 siblings, 2 replies; 17+ messages in thread
From: Mark Hahn @ 2006-05-26 16:45 UTC (permalink / raw)
To: Craig Hollabaugh; +Cc: linux-raid
> I had no idea about this particular configuration requirement. None of
just to be clear: it's not a requirement. if you want the very nice
auto-assembling behavior, you need to designate the auto-assemblable
partitions. but you can assemble "manually" without 0xfd partitions
(even if that's in an initrd, for instance.)
I think the current situation is good, since there is some danger of
going too far. for instance, testing each partition to see whether
it contains a valid superblock would be pretty crazy, right? requiring
either the "auto-assemble-me" partition type, or explicit partitions
given in a config file is a happy medium...
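To make the 0xfd point concrete: on an MBR-partitioned disk the type is a single byte in each 16-byte partition entry, with the table starting at offset 446 of the first sector. The sketch below is one way to read it from Python, assuming a classic MBR and primary partitions only (logical partitions inside an extended partition are not followed). The function name is illustrative.

```python
LINUX = 0x83                  # MBR type "Linux"
LINUX_RAID_AUTODETECT = 0xFD  # MBR type "Linux raid autodetect"


def primary_partition_types(sector0):
    """Return the type bytes of the four primary MBR partition entries."""
    if len(sector0) < 512 or sector0[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    types = []
    for index in range(4):
        # Each entry is 16 bytes; the type byte sits at offset 4 within it.
        entry = sector0[446 + 16 * index:446 + 16 * (index + 1)]
        types.append(entry[4])
    return types


# Example use (needs read access to the whole-disk device, e.g. as root):
#   with open("/dev/sdj", "rb") as disk:
#       print([hex(t) for t in primary_partition_types(disk.read(512))])
```

A first entry of 0x83 rather than 0xfd on a disk would match the situation described earlier in this thread, where the new partitions were typed 'Linux'.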
* Re: RAID5 kicks non-fresh drives
2006-05-26 16:45 ` Mark Hahn
@ 2006-05-26 17:06 ` Craig Hollabaugh
2006-05-26 17:30 ` Mark Hahn
2006-05-26 18:38 ` Luca Berra
2006-05-29 4:34 ` Neil Brown
1 sibling, 2 replies; 17+ messages in thread
From: Craig Hollabaugh @ 2006-05-26 17:06 UTC (permalink / raw)
To: Mark Hahn; +Cc: linux-raid
On Fri, 2006-05-26 at 12:45 -0400, Mark Hahn wrote:
> I think the current situation is good, since there is some danger of
> going too far. for instance, testing each partition to see whether
> it contains a valid superblock would be pretty crazy, right? requiring
> either the "auto-assemble-me" partition type, or explicit partitions
> given in a config file is a happy medium...
>
I created my array in 1/2003, don't know versions of kernel or mdadm I
was using then.
In my situation over the past few days.
kernel 2.4.30 kicked non-fresh
kernel 2.6.11.8 kicked non-fresh
kernel 2.6.16.8 didn't mention anything, just skipped my 'linux'
partitions
These kernels auto-assemble prior to mounting /. So the kernel doesn't
consult my /etc/mdadm/mdadm.conf file. Is this correct?
>
--
------------------------------------------------------------
Dr. Craig Hollabaugh, craig@hollabaugh.com, 970 240 0509
Author of Embedded Linux: Hardware, Software and Interfacing
www.embeddedlinuxinterfacing.com
* Re: RAID5 kicks non-fresh drives
2006-05-26 17:06 ` Craig Hollabaugh
@ 2006-05-26 17:30 ` Mark Hahn
2006-05-26 18:01 ` Craig Hollabaugh
2006-05-26 18:38 ` Luca Berra
1 sibling, 1 reply; 17+ messages in thread
From: Mark Hahn @ 2006-05-26 17:30 UTC (permalink / raw)
To: Craig Hollabaugh; +Cc: linux-raid
> I created my array in 1/2003, don't know versions of kernel or mdadm I
> was using then.
did you have /etc/*md* related config files? some distros use
them to assemble during boot (not quite the same as 0xfd auto-assembly,
but still pretty "auto").
> In my situation over the past few days.
> kernel 2.4.30 kicked non-fresh
> kernel 2.6.11.8 kicked non-fresh
> kernel 2.6.16.8 didn't mention anything, just skipped my 'linux'
> partitions
>
> These kernels auto-assemble prior to mounting /. So the kernel doesn't
> consult my
> /etc/mdadm/mdadm.conf file. Is this correct?
yes - the kernel traditionally doesn't, of its own accord, read files.
most stuff under /etc are inputs to user-level tools that run during
boot to instruct the kernel how to configure things. distros have,
in the past, had boot-time scripts that would run mdadm and thus
read your mdadm.conf (or the raid config files that predate mdadm...)
so perhaps your observed change in behavior had to do with distro changes...
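For the user-space route, the usual input is an ARRAY line in mdadm.conf that identifies the array by UUID. As an illustration only, this Python sketch builds such a line from saved `mdadm -D` output. `array_line` is a hypothetical helper, and the result should be checked against the mdadm.conf man page for the installed version before use.

```python
import re


def array_line(detail_text, device):
    """Build an mdadm.conf ARRAY line from the UUID in `mdadm -D` output."""
    m = re.search(r"^\s*UUID\s*:\s*([0-9a-fA-F:]+)\s*$",
                  detail_text, re.MULTILINE)
    if m is None:
        raise ValueError("no UUID found in the supplied text")
    return f"ARRAY {device} UUID={m.group(1)}"
```

With the `mdadm -D /dev/md0` output posted at the start of this thread, the helper would yield a line naming /dev/md0 and the UUID 4d862825:91140f1a:eb97e7f2:9bfa2403.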
* Re: RAID5 kicks non-fresh drives
2006-05-26 7:57 ` Mikael Abrahamsson
2006-05-26 14:11 ` Craig Hollabaugh
@ 2006-05-26 17:32 ` Bill Davidsen
2006-05-26 17:49 ` Craig Hollabaugh
2006-05-29 5:20 ` Neil Brown
2 siblings, 1 reply; 17+ messages in thread
From: Bill Davidsen @ 2006-05-26 17:32 UTC (permalink / raw)
To: Mikael Abrahamsson; +Cc: Craig Hollabaugh, Neil Brown, linux-raid
Mikael Abrahamsson wrote:
> On Thu, 25 May 2006, Craig Hollabaugh wrote:
>
>> That did it! I set the partition FS Types from 'Linux' to 'Linux raid
>> autodetect' after my last re-sync completed. Manually stopped and
>> started the array. Things looked good, so I crossed my fingers and
>> rebooted. The kernel found all the drives and all is happy here in
>> Colorado.
>
>
> Would it make sense for the raid code to somehow warn in the log when
> a device in a raid set doesn't have "Linux raid autodetect" partition
> type? If this was in "dmesg", would you have spotted the problem before?
>
As long as it is written where logwatch will see it, not recognize it,
and report it... People who don't read their logwatch reports get no
sympathy from me.
--
bill davidsen <davidsen@tmr.com>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979
* Re: RAID5 kicks non-fresh drives
2006-05-26 17:32 ` Bill Davidsen
@ 2006-05-26 17:49 ` Craig Hollabaugh
0 siblings, 0 replies; 17+ messages in thread
From: Craig Hollabaugh @ 2006-05-26 17:49 UTC (permalink / raw)
To: Bill Davidsen; +Cc: Mikael Abrahamsson, Neil Brown, linux-raid
Mikael and others,
I forgot to answer your question from a previous post. Yes, if I had
received a warning in dmesg, I would have spotted this problem. Or at
least been pointed to something to research. When I switched to the
newest kernel, I didn't even get the kicking non-fresh message, just a
list of added drives. The lack of information got me even more
concerned.
From a user perspective, here's where the disconnect occurred for me.
After the re-sync, with my array stable and running with a spare, I could
start it and stop it, mount it and unmount it without any issues. Whew,
things are looking good, my data is safe. I thought everything was good
to go. I reboot the machine and my array comes up degraded. mdadm -D
reports something completely different than what it reported before the
reboot. dmesg gives few clues about the kernel raid build process.
The disconnect for me occurs between mdadm assembling the array from
userspace and the kernel auto-detecting, binding and running. I was
under the impression that mdadm and the kernel assemble arrays in the
same fashion. In my situation, where my new drives' partition types were
different, that's not quite true.
Thanks for the help.
Craig
ps. I'm old-school here, none of my 10+ Linux hosts run logwatch, dmesg
is fine for me.
On Fri, 2006-05-26 at 13:32 -0400, Bill Davidsen wrote:
> Mikael Abrahamsson wrote:
>
> > On Thu, 25 May 2006, Craig Hollabaugh wrote:
> >
> >> That did it! I set the partition FS Types from 'Linux' to 'Linux raid
> >> autodetect' after my last re-sync completed. Manually stopped and
> >> started the array. Things looked good, so I crossed my fingers and
> >> rebooted. The kernel found all the drives and all is happy here in
> >> Colorado.
> >
> >
> > Would it make sense for the raid code to somehow warn in the log when
> > a device in a raid set doesn't have "Linux raid autodetect" partition
> > type? If this was in "dmesg", would you have spotted the problem before?
> >
> As long as it is written where logwatch will see it, not recognize it,
> and report it... People who don't read their logwatch reports get no
> sympathy from me.
>
--
------------------------------------------------------------
Dr. Craig Hollabaugh, craig@hollabaugh.com, 970 240 0509
Author of Embedded Linux: Hardware, Software and Interfacing
www.embeddedlinuxinterfacing.com
* Re: RAID5 kicks non-fresh drives
2006-05-26 17:30 ` Mark Hahn
@ 2006-05-26 18:01 ` Craig Hollabaugh
0 siblings, 0 replies; 17+ messages in thread
From: Craig Hollabaugh @ 2006-05-26 18:01 UTC (permalink / raw)
To: Mark Hahn; +Cc: linux-raid
On Fri, 2006-05-26 at 13:30 -0400, Mark Hahn wrote:
> yes - the kernel traditionally doesn't, of its own accord, read files.
> most stuff under /etc are inputs to user-level tools that run during
> boot to instruct the kernel how to configure things. distros have,
> in the past, had boot-time scripts that would run mdadm and thus
> read your mdadm.conf (or the raid config files that predate mdadm...)
>
> so perhaps your observed change in behavior had to do with distro changes...
I agree. There must have been a distro change over the past 3 years
concerning the array build process. I seem to remember a great concern
of mine to store my mdadm.conf off-site, just in case my rootfs drive
died (which it did of course). I also never set the partition types to
Linux raid either. So there's been a couple changes over the years,
probably more.
I will say this. My 2 1TB 14 drive servers have been extremely reliable
for the past 3.5 years. Occasionally, I replace the power supply but
that's about it. Just for your information, a drive will drop out of the
array when the power supply starts to droop. When I have a drive
failure, I pull the drive, externally run a bad block test and replace
if necessary. If no errors, I replace the power supply and reinsert the
old drive back into the array and rebuild. This has happened about 5
times for my 2 servers over the past 3 years.
>
>
--
------------------------------------------------------------
Dr. Craig Hollabaugh, craig@hollabaugh.com, 970 240 0509
Author of Embedded Linux: Hardware, Software and Interfacing
www.embeddedlinuxinterfacing.com
* Re: RAID5 kicks non-fresh drives
2006-05-26 17:06 ` Craig Hollabaugh
2006-05-26 17:30 ` Mark Hahn
@ 2006-05-26 18:38 ` Luca Berra
2006-05-26 19:37 ` Mark Hahn
1 sibling, 1 reply; 17+ messages in thread
From: Luca Berra @ 2006-05-26 18:38 UTC (permalink / raw)
To: linux-raid
On Fri, May 26, 2006 at 11:06:21AM -0600, Craig Hollabaugh wrote:
>On Fri, 2006-05-26 at 12:45 -0400, Mark Hahn wrote:
>> I think the current situation is good, since there is some danger of
>> going too far. for instance, testing each partition to see whether
>> it contains a valid superblock would be pretty crazy, right? requiring
>> either the "auto-assemble-me" partition type, or explicit partitions
>> given in a config file is a happy medium...
>>
>
>I created my array in 1/2003, don't know versions of kernel or mdadm I
>was using then.
>
>In my situation over the past few days.
> kernel 2.4.30 kicked non-fresh
> kernel 2.6.11.8 kicked non-fresh
> kernel 2.6.16.8 didn't mention anything, just skipped my 'linux'
>partitions
>
>These kernels auto-assemble prior to mounting /. So the kernel doesn't
>consult my
>/etc/mdadm/mdadm.conf file. Is this correct?
i strongly believe it is not correct to let kernel auto-assemble devices
kernel auto-assembly should be disabled and activation should be handled
by mdadm only!
L.
--
Luca Berra -- bluca@comedia.it
Communication Media & Services S.r.l.
/"\
\ / ASCII RIBBON CAMPAIGN
X AGAINST HTML MAIL
/ \
* Re: RAID5 kicks non-fresh drives
2006-05-26 18:38 ` Luca Berra
@ 2006-05-26 19:37 ` Mark Hahn
2006-05-27 12:21 ` Luca Berra
0 siblings, 1 reply; 17+ messages in thread
From: Mark Hahn @ 2006-05-26 19:37 UTC (permalink / raw)
To: Luca Berra; +Cc: linux-raid
> i strongly believe it is not correct to let kernel auto-assemble devices
> kernel auto-assembly should be disabled and activation should be handled
> by mdadm only!
it's a convenience/safety tradeoff, like so many other cases.
without kernel auto-assembly, it's somewhat more annoying to
boot onto MD raid, right? you are forced to put MD config stuff
into your initrd, etc.
I don't see why auto-assembly is such a bad thing. it means you
shouldn't leave 0xfd partitions sitting around, but that's OK,
since 0xfd means exactly and nothing but "please autoassemble this".
no worse than leaving inconsistent or erroneous stuff in your
mdadm.conf or /etc/rc.d/rc.sysinit.
the only argument I see against (kernel) auto-assembly is the
general principle of moving things out of the kernel where possible.
but that's not a hard/fast rule anyway, so...
* Re: RAID5 kicks non-fresh drives
2006-05-26 19:37 ` Mark Hahn
@ 2006-05-27 12:21 ` Luca Berra
0 siblings, 0 replies; 17+ messages in thread
From: Luca Berra @ 2006-05-27 12:21 UTC (permalink / raw)
To: linux-raid
On Fri, May 26, 2006 at 03:37:29PM -0400, Mark Hahn wrote:
>> i strongly believe it is not correct to let kernel auto-assemble devices
>> kernel auto-assembly should be disabled and activation should be handled
>> by mdadm only!
>
>it's a convenience/safety tradeoff, like so many other cases.
>without kernel auto-assembly, it's somewhat more annoying to
>boot onto MD raid, right? you are forced to put MD config stuff
>into your initrd, etc.
yes, it is, but initrds are generated by scripts nowadays, so you won't
even notice.
>I don't see why auto-assembly is such a bad thing. it means you
please read the list archives, it has been explained to boredom
>the only argument I see against (kernel) auto-assembly is the
>general principle of moving things out of the kernel where possible.
>but that's not a hard/fast rule anyway, so...
please read the list archives, it has been explained to boredom
Regards,
L.
and please,
do not To: or Cc: me, i do actively read the list.
L.
--
Luca Berra -- bluca@comedia.it
Communication Media & Services S.r.l.
/"\
\ / ASCII RIBBON CAMPAIGN
X AGAINST HTML MAIL
/ \
* Re: RAID5 kicks non-fresh drives
2006-05-26 16:45 ` Mark Hahn
2006-05-26 17:06 ` Craig Hollabaugh
@ 2006-05-29 4:34 ` Neil Brown
1 sibling, 0 replies; 17+ messages in thread
From: Neil Brown @ 2006-05-29 4:34 UTC (permalink / raw)
To: Mark Hahn; +Cc: Craig Hollabaugh, linux-raid
On Friday May 26, hahn@physics.mcmaster.ca wrote:
> > I had no idea about this particular configuration requirement. None of
>
> just to be clear: it's not a requirement. if you want the very nice
> auto-assembling behavior, you need to designate the auto-assemblable
> partitions. but you can assemble "manually" without 0xfd partitions
> (even if that's in an initrd, for instance.)
>
> I think the current situation is good, since there is some danger of
> going too far. for instance, testing each partition to see whether
> it contains a valid superblock would be pretty crazy, right?
I'm curious: why exactly do you say that?
Doing the reads themselves cannot be a problem as the kernel already
reads the partition table from each device. Reading superblocks is
no big deal.
If you don't like the idea of assembling everything that was found,
how is that different from.....
> requiring
> either the "auto-assemble-me" partition type, or explicit partitions
> given in a config file is a happy medium...
assembling everything that was found which had an 'auto-assemble-me'
flag? That flag, in common usage, contains almost zero information
more than the existence of the raid superblock.
Am I missing something?
My opinion: the "auto-assemble-me" partition type is not a happy
medium. The superblock containing the hostname (as supported by
mdadm-2.5) is (I hope).
NeilBrown
* Re: RAID5 kicks non-fresh drives
2006-05-26 7:57 ` Mikael Abrahamsson
2006-05-26 14:11 ` Craig Hollabaugh
2006-05-26 17:32 ` Bill Davidsen
@ 2006-05-29 5:20 ` Neil Brown
2 siblings, 0 replies; 17+ messages in thread
From: Neil Brown @ 2006-05-29 5:20 UTC (permalink / raw)
To: Mikael Abrahamsson; +Cc: Craig Hollabaugh, linux-raid
On Friday May 26, swmike@swm.pp.se wrote:
> On Thu, 25 May 2006, Craig Hollabaugh wrote:
>
> > That did it! I set the partition FS Types from 'Linux' to 'Linux raid
> > autodetect' after my last re-sync completed. Manually stopped and
> > started the array. Things looked good, so I crossed my fingers and
> > rebooted. The kernel found all the drives and all is happy here in
> > Colorado.
>
> Would it make sense for the raid code to somehow warn in the log when a
> device in a raid set doesn't have "Linux raid autodetect" partition type?
> If this was in "dmesg", would you have spotted the problem before?
Maybe. Unfortunately md doesn't really have direct access to
information on partition types. The way it gets access for
auto-detect is an ugly hack which I would rather not make any further
use of.
Maybe mdadm could be more helpful here.
e.g. when you create, assemble, or 'detail' an array it could report
any inconsistencies in the partition types, and when you --add
a device which isn't a Raid-autodetect partition to an
array that currently comprises such partitions it could give a warning.
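The kind of check described here could be prototyped outside mdadm first. A minimal sketch in Python, assuming the partition type of each member has already been obtained by some other means (the dictionary of device names to type bytes is a made-up input format, not something mdadm provides):

```python
from collections import Counter

RAID_AUTODETECT = 0xFD  # MBR type "Linux raid autodetect"


def odd_members(member_types):
    """Return devices whose partition type differs from the majority type."""
    if not member_types:
        return []
    majority, _ = Counter(member_types.values()).most_common(1)[0]
    return sorted(dev for dev, ptype in member_types.items()
                  if ptype != majority)


def add_warning(member_types, new_device, new_type):
    """Return a warning if new_device is not 0xfd but every member is."""
    all_autodetect = bool(member_types) and all(
        ptype == RAID_AUTODETECT for ptype in member_types.values()
    )
    if all_autodetect and new_type != RAID_AUTODETECT:
        return (f"{new_device}: partition type 0x{new_type:02x}, but the "
                "array's members are all 0xfd; kernel autodetect would "
                "skip this device")
    return None
```

The hard part, as noted, is obtaining the type bytes in the first place; the comparison itself is trivial once they are available.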
I had thought that 'libblkid' would help with that, but having looked
at the doco, it appears not.
Maybe I could use libparted... or maybe borrow code out of kpartx.
There don't seem to be any easy options ;-(
Thanks for the suggestion (and if anyone has some good partition
hacking code...)
NeilBrown
Thread overview: 17+ messages
2006-05-25 15:38 RAID5 kicks non-fresh drives Craig Hollabaugh
2006-05-25 21:18 ` Neil Brown
2006-05-25 21:39 ` Craig Hollabaugh
2006-05-25 22:30 ` Craig Hollabaugh
2006-05-26 7:57 ` Mikael Abrahamsson
2006-05-26 14:11 ` Craig Hollabaugh
2006-05-26 16:45 ` Mark Hahn
2006-05-26 17:06 ` Craig Hollabaugh
2006-05-26 17:30 ` Mark Hahn
2006-05-26 18:01 ` Craig Hollabaugh
2006-05-26 18:38 ` Luca Berra
2006-05-26 19:37 ` Mark Hahn
2006-05-27 12:21 ` Luca Berra
2006-05-29 4:34 ` Neil Brown
2006-05-26 17:32 ` Bill Davidsen
2006-05-26 17:49 ` Craig Hollabaugh
2006-05-29 5:20 ` Neil Brown