From: nterry <nigel@nigelterry.net>
To: Justin Piszcz <jpiszcz@lucidpixels.com>, linux-raid@vger.kernel.org
Cc: Michal Soltys <soltys@ziu.info>
Subject: Re: Raid 5 Problem
Date: Sun, 14 Dec 2008 15:58:02 -0500
Message-ID: <4945735A.6030909@nigelterry.net>
In-Reply-To: <alpine.DEB.1.10.0812141552380.27065@p34.internal.lan>

Justin Piszcz wrote:
>
>
> On Sun, 14 Dec 2008, nterry wrote:
>
>> Michal Soltys wrote:
>>> nterry wrote:
>>>> Hi. I hope someone can tell me what I have done wrong. I have a 4
>>>> disk Raid 5 array running on Fedora9. I've run this array for 2.5
>>>> years with no issues. I recently rebooted after upgrading to
>>>> Kernel 2.6.27.7. When I did this I found that only 3 of my disks
>>>> were in the array. When I examine the three active elements of the
>>>> array (/dev/sdd1, /dev/sde1, /dev/sdc1) they all show that the
>>>> array has 3 drives and one missing. When I examine the missing
>>>> drive it shows that all members of the array are present, which I
>>>> don't understand! When I try to add the missing drive back it says
>>>> the device is busy. Please see below and let me know what I need
>>>> to do to get this working again. Thanks Nigel:
>>>>
>>>> ==================================================================
>>>> [root@homepc ~]# cat /proc/mdstat
>>>> Personalities : [raid6] [raid5] [raid4]
>>>> md0 : active raid5 sdd1[0] sdc1[3] sde1[1]
>>>> 735334656 blocks level 5, 128k chunk, algorithm 2 [4/3] [UU_U]
>>>> md_d0 : inactive sdb[2](S)
>>>> 245117312 blocks
>>>> unused devices: <none>
>>>> [root@homepc ~]#
>>>
>>> For some reason, it looks like you have 2 raid arrays visible - md0
>>> and md_d0. The latter took sdb (not sdb1) as its component.
>>>
>>> sd{c,d,e}1 are in the assembled array (with appropriately updated
>>> superblocks), thus mdadm --examine calls show one device as removed,
>>> but sdb is part of another inactive array, and the superblock is
>>> untouched and shows "old" situation. Note that 0.9 superblock is
>>> stored at the end of the device (see md(4) for details), so its
>>> position could be valid for both sdb and sdb1.
>>>
>>> This might be an effect of --incremental assembly mode. Hard to tell
>>> more without seeing startup scripts, mdadm.conf, udev rules,
>>> partition layout... Did the upgrade involve anything besides the kernel?
>>>
>>> Stop both arrays, check mdadm.conf, assemble md0 manually (mdadm -A
>>> /dev/md0 /dev/sd{c,d,e}1 ), verify situation with mdadm -D. If
>>> everything looks sane, add /dev/sdb1 to the array. Still, w/o
>>> checking out startup stuff, it might happen again after reboot.
>>> Adding DEVICE /dev/sd[bcde]1 to mdadm.conf might help though.
>>>
>>> Wait a bit for other suggestions as well.
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe
>>> linux-raid" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>
>> I don't think the Kernel upgrade actually caused the problem. I
>> tried booting up on an older (2.6.27.5) kernel and that made no
>> difference. I checked the logs for anything else that might have
>> made a difference, but couldn't see anything that made any sense to
>> me. I did note that on an earlier update mdadm was upgraded:
>> Nov 26 17:08:32 Updated: mdadm-2.6.7.1-1.fc9.x86_64
>> and I did not reboot after that upgrade.
>>
>> I included my mdadm.conf with the last email and it includes:
>> ARRAY /dev/md0 level=raid5 num-devices=4
>> devices=/dev/sdb1,/dev/sdc1,/dev/sdd1,/dev/sde1
>> My configuration is just vanilla Fedora9 with the mdadm.conf I sent.
>>
>> I've never had a /dev/md_d0 array, so that must have been
>> automatically created. I may have had other devices and partitions
>> in /dev/md0 as I know I had several attempts at getting it working
>> 2.5 years ago, and I had other issues when Fedora changed device
>> naming, I think at FC7. There is only one partition on /dev/sdb, see
>> below:
>>
>> (parted) select /dev/sdb
>> Using /dev/sdb
>> (parted) print
>> Model: ATA Maxtor 6L250R0 (scsi)
>> Disk /dev/sdb: 251GB
>> Sector size (logical/physical): 512B/512B
>> Partition Table: msdos
>>
>> Number  Start   End    Size   Type     File system  Flags
>>  1      32.3kB  251GB  251GB  primary               boot, raid
>>
>> So it looks like something is creating the /dev/md_d0 and adding
>> /dev/sdb to it before /dev/md0 gets started.
>>
>> So I tried:
>> [root@homepc ~]# mdadm --stop /dev/md_d0
>> mdadm: stopped /dev/md_d0
>> [root@homepc ~]# mdadm --add /dev/md0 /dev/sdb1
>> mdadm: re-added /dev/sdb1
>> [root@homepc ~]# cat /proc/mdstat
>> Personalities : [raid6] [raid5] [raid4]
>> md0 : active raid5 sdb1[4] sdd1[0] sdc1[3] sde1[1]
>> 735334656 blocks level 5, 128k chunk, algorithm 2 [4/3] [UU_U]
>> [>....................] recovery = 0.1% (299936/245111552)
>> finish=81.6min speed=49989K/sec
>> unused devices: <none>
>> [root@homepc ~]#
>>
>> Great - all working. Then I rebooted and was back to square one, with
>> only 3 drives in /dev/md0 and /dev/sdb in /dev/md_d0.
>> So I still don't understand where /dev/md_d0 is coming from, and
>> although I know how to get things working after a reboot, clearly
>> this is not a long-term solution...
>
> What does:
>
> mdadm --examine --scan
>
> Say?
>
> Are you using a kernel with an initrd+modules or is everything
> compiled in?
>
> Justin.
>
[root@homepc ~]# mdadm --examine --scan
ARRAY /dev/md0 level=raid5 num-devices=2
UUID=c57d50aa:1b3bcabd:ab04d342:6049b3f1
spares=1
ARRAY /dev/md0 level=raid5 num-devices=4
UUID=50e3173e:b5d2bdb6:7db3576b:644409bb
spares=1
ARRAY /dev/md0 level=raid5 num-devices=4
UUID=50e3173e:b5d2bdb6:7db3576b:644409bb
spares=1
[root@homepc ~]#
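To make that output easier to read, here is a minimal Python sketch that groups the ARRAY entries by UUID. This is not something mdadm provides; `group_arrays` is a name made up for illustration, and the parser assumes the device name and num-devices appear before UUID in each entry, as they do above. It shows two different UUIDs both claiming /dev/md0, one with num-devices=2 and one with num-devices=4.

```python
from collections import defaultdict


def group_arrays(scan_output: str) -> dict:
    """Map each array UUID to the set of (device, num-devices) claims seen for it."""
    claims = defaultdict(set)
    device = num_devices = None
    for token in scan_output.split():
        if token == "ARRAY":
            device = num_devices = None  # a new entry starts here
        elif token.startswith("/dev/"):
            device = token
        elif token.startswith("num-devices="):
            num_devices = int(token.split("=", 1)[1])
        elif token.startswith("UUID="):
            claims[token.split("=", 1)[1]].add((device, num_devices))
    return dict(claims)


# The `mdadm --examine --scan` output quoted above, pasted verbatim.
SCAN = """\
ARRAY /dev/md0 level=raid5 num-devices=2
UUID=c57d50aa:1b3bcabd:ab04d342:6049b3f1
spares=1
ARRAY /dev/md0 level=raid5 num-devices=4
UUID=50e3173e:b5d2bdb6:7db3576b:644409bb
spares=1
ARRAY /dev/md0 level=raid5 num-devices=4
UUID=50e3173e:b5d2bdb6:7db3576b:644409bb
spares=1
"""

groups = group_arrays(SCAN)
for uuid, seen in groups.items():
    print(uuid, sorted(seen))
```

The sketch only reports what the superblocks claim; it does not say which entry is the stale one.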
I'm not sure I really know the answer to your second question. I'm
using a regular Fedora9 kernel, so I think that is initrd+modules:
[root@homepc ~]# uname -a
Linux homepc.nigelterry.net 2.6.27.7-53.fc9.x86_64 #1 SMP Thu Nov 27 02:05:02 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
[root@homepc ~]#
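On Michal's point that the 0.90 superblock is stored at the end of the device, here is a rough Python sketch of where it lands. It assumes the placement rule from the kernel's MD_NEW_SIZE_SECTORS macro (64 KiB aligned, at least 64 KiB before the end of the device). The function names and the sector counts are made up for illustration and were not measured from the disks in this thread.

```python
# 0.90 superblock placement: the superblock starts at the last 64 KiB
# boundary that is at least 64 KiB before the end of the device.
MD_RESERVED_SECTORS = 64 * 1024 // 512  # 128 sectors of 512 bytes


def sb_offset_090(device_sectors: int) -> int:
    """Sector offset of the 0.90 superblock, relative to the device start."""
    return (device_sectors & ~(MD_RESERVED_SECTORS - 1)) - MD_RESERVED_SECTORS


def same_superblock(disk_sectors: int, part_start: int, part_sectors: int) -> bool:
    """True if the whole disk and the partition place the superblock at the
    same absolute sector, so one superblock would be seen through both."""
    return sb_offset_090(disk_sectors) == part_start + sb_offset_090(part_sectors)


# Hypothetical 1,000,000-sector disk with one partition running to the end.
DISK = 1_000_000
print(sb_offset_090(DISK))
print(same_superblock(DISK, 128, DISK - 128))  # partition starts on a 64 KiB boundary
print(same_superblock(DISK, 63, DISK - 63))    # partition starts at sector 63
```

Whether the whole disk and its partition share a superblock location therefore depends on where the partition starts and ends, which is why mdadm -E on both /dev/sdb and /dev/sdb1 is worth comparing.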