* RAID5 not being reassembled correctly after device swap
@ 2007-07-01 21:21 Michael Frotscher
2007-07-01 22:12 ` Neil Brown
2007-07-03 17:16 ` Michael Frotscher
0 siblings, 2 replies; 12+ messages in thread
From: Michael Frotscher @ 2007-07-01 21:21 UTC (permalink / raw)
To: linux-raid
Hello RAID-Experts,
I have three RAID5 arrays, each made up of partitions on the same 3 disks
(Debian stable). The root filesystem runs on one of the md devices (/boot is a
separate non-RAID partition), and everything runs rather nicely. For
convenience I plugged all drives into the first IDE controller, making them
hda, hdb and hdc. So far, so good. The partitions are flagged "fd", i.e. Linux
raid autodetect.
As I have another onboard IDE controller, I'd like to distribute the disks
across both controllers for performance reasons, moving hdb to hde and hdc to
hdg. The arrays would then consist of drives hda, hde and hdg.
This should not be a problem, as the arrays should assemble themselves using
the superblocks on the partitions, shouldn't they?
However, when I move just one drive (hdc), the array starts degraded with only
two drives present, because it is still looking for hdc, which of course is now
hdg. This shouldn't be happening.
Well, I then re-added hdg to the degraded array, which went well, and the
array rebuilt itself. I now had healthy arrays consisting of hda, hdb and hdg.
But after a reboot the array was degraded again and the system wanted its hdc
drive back.
And yes, I edited /boot/grub/device.map and changed hdc to hdg, so that can't
be the reason.
I seem to be missing something here, but what is it?
--
YT,
Michael
* Re: RAID5 not being reassembled correctly after device swap
2007-07-01 21:21 RAID5 not being reassembled correctly after device swap Michael Frotscher
@ 2007-07-01 22:12 ` Neil Brown
2007-07-02 6:35 ` Michael Frotscher
2007-07-03 17:16 ` Michael Frotscher
1 sibling, 1 reply; 12+ messages in thread
From: Neil Brown @ 2007-07-01 22:12 UTC (permalink / raw)
To: Michael Frotscher; +Cc: linux-raid
On Sunday July 1, infomails@tronserver.dyndns.org wrote:
> Hello RAID-Experts,
>
> I have three RAID5 arrays, each made up of partitions on the same 3 disks
> (Debian stable). The root filesystem runs on one of the md devices (/boot is
> a separate non-RAID partition), and everything runs rather nicely. For
> convenience I plugged all drives into the first IDE controller, making them
> hda, hdb and hdc. So far, so good. The partitions are flagged "fd", i.e.
> Linux raid autodetect.
>
> As I have another onboard IDE controller, I'd like to distribute the disks
> across both controllers for performance reasons, moving hdb to hde and hdc
> to hdg. The arrays would then consist of drives hda, hde and hdg.
>
> This should not be a problem, as the arrays should assemble themselves using
> the superblocks on the partitions, shouldn't they?
>
> However, when I move just one drive (hdc), the array starts degraded with
> only two drives present, because it is still looking for hdc, which of
> course is now hdg. This shouldn't be happening.
>
> Well, I then re-added hdg to the degraded array, which went well, and the
> array rebuilt itself. I now had healthy arrays consisting of hda, hdb and
> hdg. But after a reboot the array was degraded again and the system wanted
> its hdc drive back.
>
> And yes, I edited /boot/grub/device.map and changed hdc to hdg, so that can't
> be the reason.
>
> I seem to be missing something here, but what is it?
Kernel logs from the boot would help here.
Maybe /etc/mdadm/mdadm.conf lists "device=...." where it shouldn't.
Maybe the other IDE controller uses a module that is loaded late.
Logs would help.
NeilBrown
* Re: RAID5 not being reassembled correctly after device swap
2007-07-01 22:12 ` Neil Brown
@ 2007-07-02 6:35 ` Michael Frotscher
2007-07-02 6:50 ` Michael Frotscher
0 siblings, 1 reply; 12+ messages in thread
From: Michael Frotscher @ 2007-07-02 6:35 UTC (permalink / raw)
To: linux-raid
On Monday 02 July 2007 00:12:14 Neil Brown wrote:
> Kernel logs from the boot would help here.
> Logs would help.
Sure. The interesting part from dmesg is this:
hdg: max request size: 512KiB
hdg: 398297088 sectors (203928 MB) w/8192KiB Cache, CHS=24792/255/63, UDMA(100)
hdg: cache flushes supported
hdg: hdg1 hdg2 hdg3 hdg4
hda: max request size: 512KiB
hda: 390721968 sectors (200049 MB) w/8192KiB Cache, CHS=24321/255/63, UDMA(100)
hda: cache flushes supported
hda: hda1 hda2 hda3 hda4
hdb: max request size: 512KiB
hdb: 490234752 sectors (251000 MB) w/7936KiB Cache, CHS=30515/255/63, UDMA(100)
hdb: cache flushes supported
hdb: hdb1 hdb2 hdb3 hdb4
md: md3 stopped.
md: bind<hdb3>
md: bind<hda3>
raid5: device hda3 operational as raid disk 0
raid5: device hdb3 operational as raid disk 1
raid5: allocated 3163kB for md3
raid5: raid level 5 set md3 active with 2 out of 3 devices, algorithm 2
RAID5 conf printout:
--- rd:3 wd:2 fd:1
disk 0, o:1, dev:hda3
disk 1, o:1, dev:hdb3
What I really don't understand is the output of /proc/mdstat after a reboot:
Personalities : [raid6] [raid5] [raid4]
md4 : active raid5 hdg4[1] hda4[2]
368643328 blocks level 5, 4k chunk, algorithm 2 [3/2] [_UU]
md2 : active raid5 hda2[0] hdg2[2]
1027968 blocks level 5, 4k chunk, algorithm 2 [3/2] [U_U]
md3 : active raid5 hda3[0] hdb3[1]
20980736 blocks level 5, 4k chunk, algorithm 2 [3/2] [UU_]
All arrays are degraded, but different disks are missing. md3 (the root
partition) is missing its hdg member, as the log above shows. md2 and md4 are
missing their hdb partitions instead:
md: md2 stopped.
md: bind<hdg2>
md: bind<hda2>
raid5: device hda2 operational as raid disk 0
raid5: device hdg2 operational as raid disk 2
raid5: allocated 3163kB for md2
raid5: raid level 5 set md2 active with 2 out of 3 devices, algorithm 2
Btw., is it significant that the order is different? In md4, the hdg disk is
raid disk 1, whereas it is raid disk 2 in md2.
> Maybe /etc/mdadm/mdadm.conf lists "device=...." where it shouldn't.
That should be irrelevant, as the root filesystem, where mdadm.conf resides,
is itself on one of the arrays.
> Maybe the other IDE controller uses a module that is loaded late.
Hmm, I'd need to check that after I rebuild the arrays. Maybe the other
IDE-controller is not in the initrd. That wouldn't explain the missing hdb,
though.
--
YT,
Michael
* Re: RAID5 not being reassembled correctly after device swap
2007-07-02 6:35 ` Michael Frotscher
@ 2007-07-02 6:50 ` Michael Frotscher
0 siblings, 0 replies; 12+ messages in thread
From: Michael Frotscher @ 2007-07-02 6:50 UTC (permalink / raw)
To: linux-raid
On Monday 02 July 2007 08:35:18 Michael Frotscher wrote:
> Hmm, I'd need to check that after I rebuild the arrays. Maybe the other
> IDE-controller is not in the initrd.
No, although this sounded like a good idea. The IDE controller is initialized
before the arrays are assembled, and even including its driver explicitly in
the initrd results in an initrd of the same size as before, so it must have
been in there all along.
--
YT,
Michael
* Re: RAID5 not being reassembled correctly after device swap
2007-07-01 21:21 RAID5 not being reassembled correctly after device swap Michael Frotscher
2007-07-01 22:12 ` Neil Brown
@ 2007-07-03 17:16 ` Michael Frotscher
2007-07-03 17:22 ` Patrik Jonsson
2007-07-03 18:43 ` David Greaves
1 sibling, 2 replies; 12+ messages in thread
From: Michael Frotscher @ 2007-07-03 17:16 UTC (permalink / raw)
To: linux-raid
Hello all,
I guess you could say that I'm at my wit's end. I really don't get it. A RAID
array is supposed to recognize its members purely by their UUIDs, isn't it? So
technically, I should be able to remove a drive from one bus and reconnect it
to another, giving it a new device name, and the array should not even need to
sync.
Somehow it doesn't. Somehow the array remembers which devices it is supposed
to be assembled from and boots in degraded mode, oddly not always missing the
swapped drive.
Suppose I lose a whole IDE controller and want to restart the box on a
secondary controller? That one would surely not provide the devices hda through
hdd, and my array would refuse to start.
Does anyone have a suggestion of what I can try? OK, the array runs fine as
long as it is connected to its original bus, but I really don't want to take
chances here.
--
YT,
Michael
* Re: RAID5 not being reassembled correctly after device swap
2007-07-03 17:16 ` Michael Frotscher
@ 2007-07-03 17:22 ` Patrik Jonsson
2007-07-03 18:43 ` David Greaves
1 sibling, 0 replies; 12+ messages in thread
From: Patrik Jonsson @ 2007-07-03 17:22 UTC (permalink / raw)
To: Michael Frotscher; +Cc: linux-raid
Michael Frotscher wrote:
> Hello all,
>
> I guess you could say that I'm at my wit's end. I really don't get it. A
> RAID array is supposed to recognize its members purely by their UUIDs, isn't
> it? So technically, I should be able to remove a drive from one bus and
> reconnect it to another, giving it a new device name, and the array should
> not even need to sync.
>
> Somehow it doesn't. Somehow the array remembers which devices it is supposed
> to be assembled from and boots in degraded mode, oddly not always missing
> the swapped drive.
>
> Suppose I lose a whole IDE controller and want to restart the box on a
> secondary controller? That one would surely not provide the devices hda
> through hdd, and my array would refuse to start.
>
> Does anyone have a suggestion of what I can try? OK, the array runs fine as
> long as it is connected to its original bus, but I really don't want to take
> chances here.
>
Funny, I did just this on my 10-disk raid5. 4 drives were moved from an
onboard sata controller to an Areca raid controller, and the array
didn't care. That was of course while the machine was down. If you fail
out a bunch of drives by having the controller go bad, that's probably
different.
cheers,
/Patrik
* Re: RAID5 not being reassembled correctly after device swap
2007-07-03 17:16 ` Michael Frotscher
2007-07-03 17:22 ` Patrik Jonsson
@ 2007-07-03 18:43 ` David Greaves
2007-07-03 19:20 ` Michael Frotscher
2007-07-03 19:29 ` Michael Frotscher
1 sibling, 2 replies; 12+ messages in thread
From: David Greaves @ 2007-07-03 18:43 UTC (permalink / raw)
To: Michael Frotscher; +Cc: linux-raid
Michael Frotscher wrote:
> Hello all,
>
> I guess you could say that I'm at my wit's end. I really don't get it. A
> RAID array is supposed to recognize its members purely by their UUIDs, isn't
> it? So technically, I should be able to remove a drive from one bus and
> reconnect it to another, giving it a new device name, and the array should
> not even need to sync.
>
> Somehow it doesn't. Somehow the array remembers which devices it is supposed
> to be assembled from and boots in degraded mode, oddly not always missing
> the swapped drive.
>
> Suppose I lose a whole IDE controller and want to restart the box on a
> secondary controller? That one would surely not provide the devices hda
> through hdd, and my array would refuse to start.
>
> Does anyone have a suggestion of what I can try? OK, the array runs fine as
> long as it is connected to its original bus, but I really don't want to take
> chances here.
Do you have an mdadm.conf file that specifies/limits the partitions to search?
David
* Re: RAID5 not being reassembled correctly after device swap
2007-07-03 18:43 ` David Greaves
@ 2007-07-03 19:20 ` Michael Frotscher
2007-07-03 19:29 ` Michael Frotscher
1 sibling, 0 replies; 12+ messages in thread
From: Michael Frotscher @ 2007-07-03 19:20 UTC (permalink / raw)
To: linux-raid
Hi David,
> Do you have an mdadm.conf file that specifies/limits the partitions to search?
Just the usual "DEVICE partitions" line followed by the ARRAY lines. However, I
don't think the problem is in mdadm.conf; I suspect the superblocks. Right now
my main worry is the array which holds the root filesystem. The others I was
able to resurrect (with the disks back at their original IDE ports) using the
--update option when assembling. As I cannot do that with the root array (it
didn't work when I booted off a CD), I'm a bit stuck.
When the system starts, it does not even bother to look for the third array
component but starts the array degraded. I can then "mdadm -a" the third disk
back into the array, it resynchronizes, and everything looks good. Then the
same thing happens at the next boot.
Isn't there an option which updates all superblocks on an assembled array,
saying: you, partitions, are an array and will stay an array until your
superblocks are erased or hell freezes over, whichever happens first. Amen.
--
YT,
Michael
* Re: RAID5 not being reassembled correctly after device swap
2007-07-03 18:43 ` David Greaves
2007-07-03 19:20 ` Michael Frotscher
@ 2007-07-03 19:29 ` Michael Frotscher
2007-07-04 8:45 ` David Greaves
2007-07-04 13:35 ` Bill Davidsen
1 sibling, 2 replies; 12+ messages in thread
From: Michael Frotscher @ 2007-07-03 19:29 UTC (permalink / raw)
To: linux-raid
I forgot, in case it's of any help.
mdadm -D gives after reassembly:
/dev/md3:
Version : 00.90.03
Creation Time : Sun Jan 14 21:17:53 2007
Raid Level : raid5
Array Size : 20980736 (20.01 GiB 21.48 GB)
Device Size : 10490368 (10.00 GiB 10.74 GB)
Raid Devices : 3
Total Devices : 3
Preferred Minor : 3
Persistence : Superblock is persistent
Update Time : Tue Jul 3 21:21:53 2007
State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 4K
UUID : 36bbe21d:f49e8b5d:f504154c:a6f12a51
Events : 0.6995604
    Number   Major   Minor   RaidDevice State
       0       3        3        0      active sync   /dev/hda3
       1       3       67        1      active sync   /dev/hdb3
       2      22        3        2      active sync   /dev/hdc3
and after the next boot:
/dev/md3:
Version : 00.90.03
Creation Time : Sun Jan 14 21:17:53 2007
Raid Level : raid5
Array Size : 20980736 (20.01 GiB 21.48 GB)
Device Size : 10490368 (10.00 GiB 10.74 GB)
Raid Devices : 3
Total Devices : 2
Preferred Minor : 3
Persistence : Superblock is persistent
Update Time : Tue Jul 3 21:27:08 2007
State : clean, degraded
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 4K
UUID : 36bbe21d:f49e8b5d:f504154c:a6f12a51
Events : 0.6995718
    Number   Major   Minor   RaidDevice State
       0       3        3        0      active sync   /dev/hda3
       1       3       67        1      active sync   /dev/hdb3
       2       0        0        2      removed
Any ideas on why the drive keeps being removed?
--
YT,
Michael
* Re: RAID5 not being reassembled correctly after device swap
2007-07-03 19:29 ` Michael Frotscher
@ 2007-07-04 8:45 ` David Greaves
2007-07-04 16:31 ` Michael Frotscher
2007-07-04 13:35 ` Bill Davidsen
1 sibling, 1 reply; 12+ messages in thread
From: David Greaves @ 2007-07-04 8:45 UTC (permalink / raw)
To: Michael Frotscher; +Cc: linux-raid
Michael Frotscher wrote:
> I forgot, in case it's of any help.
Also do
mdadm --examine /dev/hd[abc]3
David
* Re: RAID5 not being reassembled correctly after device swap
2007-07-04 8:45 ` David Greaves
@ 2007-07-04 16:31 ` Michael Frotscher
0 siblings, 0 replies; 12+ messages in thread
From: Michael Frotscher @ 2007-07-04 16:31 UTC (permalink / raw)
To: linux-raid
On Wednesday 04 July 2007 10:45:22 David Greaves wrote:
> mdadm --examine /dev/hd[abc]3
I'll attach that as a file as it's quite lengthy. It is from the reassembled
array - if it helps, I can reboot again and provide the same for the degraded
array that exists after a reboot.
> And could you share an "fdisk -l" output
fdisk shows nothing unusual:
Disk /dev/hdc: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot      Start         End      Blocks   Id  System
/dev/hdc1               1           4       32098+  83  Linux
/dev/hdc2               5          68      514080   fd  Linux raid autodetect
/dev/hdc3              69        1374    10490445   fd  Linux raid autodetect
/dev/hdc4            1375       24321   184321777+  fd  Linux raid autodetect
Yes, the last partition does not extend to the end of the drive; that's
because hda is only a 200GB drive:
Disk /dev/hda: 200.0 GB, 200049647616 bytes
255 heads, 63 sectors/track, 24321 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot      Start         End      Blocks   Id  System
/dev/hda1   *           1           4       32098+  83  Linux
/dev/hda2               5          68      514080   fd  Linux raid autodetect
/dev/hda3              69        1374    10490445   fd  Linux raid autodetect
/dev/hda4            1375       24321   184321777+  fd  Linux raid autodetect
Thanks for the help everyone!
--
YT,
Michael
[-- Attachment #2: md3-array.txt --]
[-- Type: text/plain, Size: 2772 bytes --]
/dev/hda3:
Magic : a92b4efc
Version : 00.90.00
UUID : 36bbe21d:f49e8b5d:f504154c:a6f12a51
Creation Time : Sun Jan 14 21:17:53 2007
Raid Level : raid5
Device Size : 10490368 (10.00 GiB 10.74 GB)
Array Size : 20980736 (20.01 GiB 21.48 GB)
Raid Devices : 3
Total Devices : 3
Preferred Minor : 3
Update Time : Wed Jul 4 18:24:59 2007
State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0
Checksum : fe26fd74 - correct
Events : 0.6996556
Layout : left-symmetric
Chunk Size : 4K
      Number   Major   Minor   RaidDevice State
this     0       3        3        0      active sync   /dev/hda3
   0     0       3        3        0      active sync   /dev/hda3
   1     1       3       67        1      active sync   /dev/hdb3
   2     2      22        3        2      active sync   /dev/hdc3
/dev/hdb3:
Magic : a92b4efc
Version : 00.90.00
UUID : 36bbe21d:f49e8b5d:f504154c:a6f12a51
Creation Time : Sun Jan 14 21:17:53 2007
Raid Level : raid5
Device Size : 10490368 (10.00 GiB 10.74 GB)
Array Size : 20980736 (20.01 GiB 21.48 GB)
Raid Devices : 3
Total Devices : 3
Preferred Minor : 3
Update Time : Wed Jul 4 18:25:02 2007
State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0
Checksum : fe26fdb9 - correct
Events : 0.6996556
Layout : left-symmetric
Chunk Size : 4K
      Number   Major   Minor   RaidDevice State
this     1       3       67        1      active sync   /dev/hdb3
   0     0       3        3        0      active sync   /dev/hda3
   1     1       3       67        1      active sync   /dev/hdb3
   2     2      22        3        2      active sync   /dev/hdc3
/dev/hdc3:
Magic : a92b4efc
Version : 00.90.00
UUID : 36bbe21d:f49e8b5d:f504154c:a6f12a51
Creation Time : Sun Jan 14 21:17:53 2007
Raid Level : raid5
Device Size : 10490368 (10.00 GiB 10.74 GB)
Array Size : 20980736 (20.01 GiB 21.48 GB)
Raid Devices : 3
Total Devices : 3
Preferred Minor : 3
Update Time : Wed Jul 4 18:25:02 2007
State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0
Checksum : fe26fd8e - correct
Events : 0.6996556
Layout : left-symmetric
Chunk Size : 4K
      Number   Major   Minor   RaidDevice State
this     2      22        3        2      active sync   /dev/hdc3
   0     0       3        3        0      active sync   /dev/hda3
   1     1       3       67        1      active sync   /dev/hdb3
   2     2      22        3        2      active sync   /dev/hdc3
* Re: RAID5 not being reassembled correctly after device swap
2007-07-03 19:29 ` Michael Frotscher
2007-07-04 8:45 ` David Greaves
@ 2007-07-04 13:35 ` Bill Davidsen
1 sibling, 0 replies; 12+ messages in thread
From: Bill Davidsen @ 2007-07-04 13:35 UTC (permalink / raw)
To: Michael Frotscher; +Cc: linux-raid
Michael Frotscher wrote:
> I forgot, in case it's of any help.
> mdadm -D gives after reassembly:
>
> [snip]
>
> Any ideas on why the drive keeps being removed?
>
Do you see anything in dmesg which would indicate an error on the drive?
And could you share an "fdisk -l" output so we can see what the kernel
thinks is on the drive? My guess is that for some reason the device is being
considered unready, has the wrong partition type, or was low-level formatted
on a Monday. Okay, that last one is unlikely ;-)
--
bill davidsen <davidsen@tmr.com>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979