* Raid5 & Debian Yaird Woes
@ 2006-02-03 1:29 Lewis Shobbrook
2006-02-03 2:13 ` dean gaudet
0 siblings, 1 reply; 11+ messages in thread
From: Lewis Shobbrook @ 2006-02-03 1:29 UTC (permalink / raw)
To: linux-raid, dr
Hi All,
I'm trying to get my head around the way that the new debian initrd system
"yaird" and mdadm.conf interact.
While running raid5 with yaird, I've discovered that if I replace or remove a
healthy drive, without manually using mdadm --set-faulty, the system will not
reboot. I get startup messages stating waiting X seconds for /dev/sdc,
eventually dropping me into a useless (for raid purposes) maintenance shell.
If I continue to boot via use of 'ctrl D', the system kernel panics, telling
me it has 2/3 members but needs all 3. This seriously impacts the benefit of
using raid5.
Problems also occur if the disk is replaced and the raid reconstructed
(using an alternate kernel initrd): somehow the new replacement drive is set
as faulty again during startup, resulting in the failure described above,
unless I first create a fresh yaird initrd.img via re-installation of the
kernel.deb prior to the system restart.
My mdadm.conf (I never needed to use at all previous to the yaird system) is
as follows...
ARRAY /dev/md0 level=raid1 num-devices=3 devices=/dev/sda2,/dev/sdb2,/dev/sdc2
auto=yes
ARRAY /dev/md1 level=raid5 num-devices=3 auto=yes
UUID=a3452240:a1578a31:737679af:58f53690
DEVICE partitions
The yaird documentation recommended the use of at least auto=md, but
using it results in errors (auto=md unknown something or other) that cause
kernel installation to fail.
Hoping someone can ease my pain here?
Cheers,
Lewis
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Raid5 & Debian Yaird Woes
2006-02-03 1:29 Raid5 & Debian Yaird Woes Lewis Shobbrook
@ 2006-02-03 2:13 ` dean gaudet
2006-02-03 2:14 ` Lewis Shobbrook
2006-02-03 3:02 ` dean gaudet
0 siblings, 2 replies; 11+ messages in thread
From: dean gaudet @ 2006-02-03 2:13 UTC (permalink / raw)
To: Lewis Shobbrook; +Cc: linux-raid, dr
i've never looked at yaird in detail -- but you can probably use
initramfs-tools instead of yaird... the deb 2.6.14 and later kernels will
use whichever one of those is installed. i know that initramfs-tools uses
mdrun to start the root partition based on its UUID -- and so it should
work fine (to get root mounted) even without dorking around with
mdadm.conf.
but if you want to stick with yaird:
On Fri, 3 Feb 2006, Lewis Shobbrook wrote:
> My mdadm.conf (I never needed to use at all previous to the yaird system) is
> as follows...
> ARRAY /dev/md0 level=raid1 num-devices=3 devices=/dev/sda2,/dev/sdb2,/dev/sdc2
> auto=yes
> ARRAY /dev/md1 level=raid5 num-devices=3 auto=yes
> UUID=a3452240:a1578a31:737679af:58f53690
> DEVICE partitions
some wrapping occurred there i'm guessing...
you might be a lot happier if your /dev/md0 also specified the UUID rather
than the individual devices. this is probably the source of your
troubles.
you can get the UUID by doing "mdadm --examine /dev/sda2".
or you can try: mdadm --examine --scan --brief ... just prepend "DEVICE
partitions" in front of that and you should be happy.
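As a rough sketch of the suggestion above, the scan output can be piped through a small wrapper that puts the DEVICE line first. The helper name build_mdadm_conf and the /tmp output path are made up for illustration; review the result by hand before copying it over your real mdadm.conf.

```shell
# Sketch only: build_mdadm_conf is a hypothetical helper, not an mdadm feature.
# It reads ARRAY lines on stdin and writes a complete config to stdout,
# with the DEVICE line placed first.
build_mdadm_conf() {
    printf 'DEVICE partitions\n'
    cat
}

# Intended use, as root (not run here):
#   mdadm --examine --scan --brief | build_mdadm_conf > /tmp/mdadm.conf.new
```

Writing to a scratch file first means a bad scan cannot clobber a working config.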
-dean
* Re: Raid5 & Debian Yaird Woes
2006-02-03 2:13 ` dean gaudet
@ 2006-02-03 2:14 ` Lewis Shobbrook
2006-02-03 3:02 ` dean gaudet
1 sibling, 0 replies; 11+ messages in thread
From: Lewis Shobbrook @ 2006-02-03 2:14 UTC (permalink / raw)
To: dean gaudet, linux-raid
On Friday 03 February 2006 1:13 pm, you wrote:
Thanks Dean,
I'll try this out...
> i've never looked at yaird in detail -- but you can probably use
> initramfs-tools instead of yaird... the deb 2.6.14 and later kernels will
> use whichever one of those is installed. i know that initramfs-tools uses
> mdrun to start the root partition based on its UUID -- and so it should
> work fine (to get root mounted) even without dorking around with
> mdadm.conf.
>
> but if you want to stick with yaird:
>
> On Fri, 3 Feb 2006, Lewis Shobbrook wrote:
> > My mdadm.conf (I never needed to use at all previous to the yaird system)
> > is as follows...
> > ARRAY /dev/md0 level=raid1 num-devices=3
> > devices=/dev/sda2,/dev/sdb2,/dev/sdc2 auto=yes
> > ARRAY /dev/md1 level=raid5 num-devices=3 auto=yes
> > UUID=a3452240:a1578a31:737679af:58f53690
> > DEVICE partitions
>
> some wrapping occurred there i'm guessing...
>
> you might be a lot happier if your /dev/md0 also specified the UUID rather
> than the individual devices. this is probably the source of your
> troubles.
Seems a bit confusing and fickle of yaird that all md devices must follow the
uuid syntax in mdadm.conf.
How do you expect that this would affect the detection of /dev/md1, where
the uuids on all components are intact, and /dev/md0 has the 'non-uuid'
syntax?
When yaird first arrived (I did not specifically install it; it came in with a
dist-upgrade), I had initial problems with the boot sequence where the
root /dev/md0 wasn't starting, despite my being able to start it manually from
the recovery console. Specifying the devices in mdadm.conf was the initial
fix. I'd never found the need to use mdadm.conf at all previously.
I can't really try this til I get home, if the machine doesn't come back up my
wife will have no MythTV playschool episodes for the rugrats.
I'll let you know how it goes.
Cheers,
Lewis
* Re: Raid5 & Debian Yaird Woes
2006-02-03 2:13 ` dean gaudet
2006-02-03 2:14 ` Lewis Shobbrook
@ 2006-02-03 3:02 ` dean gaudet
2006-02-03 22:58 ` Lewis Shobbrook
1 sibling, 1 reply; 11+ messages in thread
From: dean gaudet @ 2006-02-03 3:02 UTC (permalink / raw)
To: Lewis Shobbrook; +Cc: linux-raid, dr
On Thu, 2 Feb 2006, dean gaudet wrote:
> i've never looked at yaird in detail -- but you can probably use
> initramfs-tools instead of yaird...
i take it all back... i just tried initramfs-tools and it failed to boot
my system properly... whereas yaird almost got everything right.
the main thing i'd say yaird is doing wrong is that it is specifying the
root raid devices explicitly rather than allowing mdadm to scan the
partitions list and assemble by UUID...
maybe try the patch below on your yaird configuration and then run:
dpkg-reconfigure linux-image-`uname -r`
which will rebuild your initrd with this change... then see if it survives
your boot testing.
-dean
p.s. this patch has been submitted to debian bugdb...
--- /etc/yaird/Templates.cfg 2006/02/03 02:44:49 1.1
+++ /etc/yaird/Templates.cfg 2006/02/03 02:46:15
@@ -299,8 +299,7 @@
SCRIPT "/init"
BEGIN
!mknod <TMPL_VAR NAME=target> b <TMPL_VAR NAME=major> <TMPL_VAR NAME=minor>
- !mdadm --assemble <TMPL_VAR NAME=target> --uuid <TMPL_VAR NAME=uuid> \
- ! <TMPL_LOOP NAME=components> <TMPL_VAR NAME=dev></TMPL_LOOP>
+ !mdadm -Ac partitions <TMPL_VAR NAME=target> --uuid <TMPL_VAR NAME=uuid>
END SCRIPT
END TEMPLATE
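To make the effect of the patched template line concrete, here is a sketch of the command it would expand to once the template variables are filled in. In mdadm, -A is the short form of --assemble and "-c partitions" is --config=partitions, which tells mdadm to take its candidate devices from /proc/partitions. The helper render_assemble is hypothetical, and the md1 UUID from earlier in the thread is used purely as sample input (the root array's UUID is not given here).

```shell
# render_assemble: hypothetical helper that only prints the command the
# patched template line would produce; it does not run mdadm.
render_assemble() {
    target=$1
    uuid=$2
    printf 'mdadm --assemble --config=partitions %s --uuid %s\n' "$target" "$uuid"
}

# Sample values only; substitute the real target and UUID.
render_assemble /dev/md1 a3452240:a1578a31:737679af:58f53690
```

Unlike the original template line, no component device names appear in the command, so a missing or renamed member is not baked into the initrd.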
* Re: Raid5 & Debian Yaird Woes
2006-02-03 3:02 ` dean gaudet
@ 2006-02-03 22:58 ` Lewis Shobbrook
2006-02-04 0:22 ` dean gaudet
0 siblings, 1 reply; 11+ messages in thread
From: Lewis Shobbrook @ 2006-02-03 22:58 UTC (permalink / raw)
To: dean gaudet; +Cc: linux-raid, dr
On Friday 03 February 2006 2:02 pm, you wrote:
Hi Dean,
Thanks for the suggestions.
> On Thu, 2 Feb 2006, dean gaudet wrote:
> > i've never looked at yaird in detail -- but you can probably use
> > initramfs-tools instead of yaird...
>
> i take it all back... i just tried initramfs-tools and it failed to boot
> my system properly... whereas yaird almost got everything right.
>
> the main thing i'd say yaird is doing wrong is that it is specifying the
> root raid devices explicitly rather than allowing mdadm to scan the
> partitions list and assemble by UUID...
>
> maybe try the patch below on your yaird configuration and then run:
>
> dpkg-reconfigure linux-image-`uname -r`
>
> which will rebuild your initrd with this change... then see if it survives
> your boot testing.
>
> -dean
>
> p.s. this patch has been submitted to debian bugdb...
>
> --- /etc/yaird/Templates.cfg 2006/02/03 02:44:49 1.1
> +++ /etc/yaird/Templates.cfg 2006/02/03 02:46:15
> @@ -299,8 +299,7 @@
> SCRIPT "/init"
> BEGIN
> !mknod <TMPL_VAR NAME=target> b <TMPL_VAR NAME=major> <TMPL_VAR NAME=minor>
> - !mdadm --assemble <TMPL_VAR NAME=target> --uuid <TMPL_VAR NAME=uuid> \
> - ! <TMPL_LOOP NAME=components> <TMPL_VAR NAME=dev></TMPL_LOOP>
> + !mdadm -Ac partitions <TMPL_VAR NAME=target> --uuid <TMPL_VAR NAME=uuid>
> END SCRIPT
> END TEMPLATE
I applied the patch as well as modified the mdadm.conf, as you suggested in
the previous email, and the system restarted without problem!
A positive step forward.
Removing a drive however, results in a disruption to the boot process
requiring user input (ctrl D) in the admin console to kick things off again.
Notably it works from this point, where previously I had encountered kernel
panic.
Is there any way to avoid this requirement for input, so that the system skips
the missing drive as the raid/initrd system did previously?
If you have a system restart after a power outage combined with a degraded
array, the server would be unacceptably kept offline until manual
intervention occurred.
Cheers & Thanks,
Lewis
* Re: Raid5 & Debian Yaird Woes
2006-02-03 22:58 ` Lewis Shobbrook
@ 2006-02-04 0:22 ` dean gaudet
2006-02-04 8:35 ` Jonas Smedegaard
2006-02-04 22:07 ` Lewis Shobbrook
0 siblings, 2 replies; 11+ messages in thread
From: dean gaudet @ 2006-02-04 0:22 UTC (permalink / raw)
To: Lewis Shobbrook; +Cc: linux-raid, dr
On Sat, 4 Feb 2006, Lewis Shobbrook wrote:
> Is there any way to avoid this requirement for input, so that the system skips
> the missing drive as the raid/initrd system did previously?
what boot errors are you getting before it drops you to the root password
prompt?
is it trying to fsck some filesystem it doesn't have access to?
-dean
* Re: Raid5 & Debian Yaird Woes
2006-02-04 0:22 ` dean gaudet
@ 2006-02-04 8:35 ` Jonas Smedegaard
2006-02-04 22:07 ` Lewis Shobbrook
1 sibling, 0 replies; 11+ messages in thread
From: Jonas Smedegaard @ 2006-02-04 8:35 UTC (permalink / raw)
To: dean gaudet; +Cc: mylists, linux-raid
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
This thread is all very relevant.
But please cc yaird-devel@lists.alioth.debian.org rather than me
privately.
Regards,
- Jonas
- --
* Jonas Smedegaard - idealist og Internet-arkitekt
* Tlf.: +45 40843136 Website: http://dr.jones.dk/
- Enden er nær: http://www.shibumi.org/eoti.htm
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (GNU/Linux)
iD8DBQFD5Gc/n7DbMsAkQLgRAq9XAKCTicLEnlz6iK5USZAVH0oD6bCzeQCgh1tE
jgtJm7dsf0b5oKdx0JWnnpk=
=4g1e
-----END PGP SIGNATURE-----
* Re: Raid5 & Debian Yaird Woes
2006-02-04 0:22 ` dean gaudet
2006-02-04 8:35 ` Jonas Smedegaard
@ 2006-02-04 22:07 ` Lewis Shobbrook
2006-02-07 2:52 ` dean gaudet
2006-04-24 15:13 ` Jonas Smedegaard
1 sibling, 2 replies; 11+ messages in thread
From: Lewis Shobbrook @ 2006-02-04 22:07 UTC (permalink / raw)
To: dean gaudet; +Cc: linux-raid, dr
On Saturday 04 February 2006 11:22 am, you wrote:
> On Sat, 4 Feb 2006, Lewis Shobbrook wrote:
> > Is there any way to avoid this requirement for input, so that the system
> > skips the missing drive as the raid/initrd system did previously?
>
> what boot errors are you getting before it drops you to the root password
> prompt?
Basically it just states waiting X seconds for /dev/sdx3 (corresponding to the
missing raid5 member). Where X cycles from 2,4,8,16 and then drops you into a
recovery console, no root pwd prompt.
It will only occur if the partition is completely missing, such as a
replacement disk with a blank partition table, or a completely missing/failed
drive.
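The behaviour described above looks like a doubling wait loop. The following is only a guess at its shape, written to illustrate the 2, 4, 8, 16 pattern; it is not taken from the yaird sources, and both wait_for_device and the SLEEP_CMD override are invented names.

```shell
# wait_for_device: illustrative only; this is not yaird's actual code.
# Polls for a device node, doubling the delay each round (2, 4, 8, 16),
# and returns non-zero if the node never appears.
wait_for_device() {
    dev=$1
    delay=2
    while [ "$delay" -le 16 ]; do
        if [ -e "$dev" ]; then
            return 0
        fi
        echo "Waiting $delay seconds for $dev"
        # SLEEP_CMD is a hypothetical override so the loop can be dry-run.
        ${SLEEP_CMD:-sleep} "$delay"
        delay=$((delay * 2))
    done
    # One last check after the final wait.
    [ -e "$dev" ]
}
```

A loop like this fails hard when a named device never shows up, which would explain why a completely absent partition stops the boot while a present-but-faulty one does not.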
> is it trying to fsck some filesystem it doesn't have access to?
No fsck seen for bad extX partitions etc.
Cheers,
Lewis
* Re: Raid5 & Debian Yaird Woes
2006-02-04 22:07 ` Lewis Shobbrook
@ 2006-02-07 2:52 ` dean gaudet
2006-04-24 15:13 ` Jonas Smedegaard
1 sibling, 0 replies; 11+ messages in thread
From: dean gaudet @ 2006-02-07 2:52 UTC (permalink / raw)
To: Lewis Shobbrook; +Cc: linux-raid
On Sun, 5 Feb 2006, Lewis Shobbrook wrote:
> On Saturday 04 February 2006 11:22 am, you wrote:
> > On Sat, 4 Feb 2006, Lewis Shobbrook wrote:
> > > Is there any way to avoid this requirement for input, so that the system
> > > skips the missing drive as the raid/initrd system did previously?
> >
> > what boot errors are you getting before it drops you to the root password
> > prompt?
>
> Basically it just states waiting X seconds for /dev/sdx3 (corresponding to the
> missing raid5 member). Where X cycles from 2,4,8,16 and then drops you into a
> recovery console, no root pwd prompt.
> It will only occur if the partition is completely missing, such as a
> replacement disk with a blank partition table, or a completely missing/failed
> drive.
> > is it trying to fsck some filesystem it doesn't have access to?
>
> No fsck seen for bad extX partitions etc.
try something like this...
cd /tmp
mkdir t
cd t
zcat /boot/initrd.img-`uname -r` | cpio -i
grep -r sd.3 .
that should show us what script is directly accessing /dev/sdx3 ... maybe
there's something more we can do about it.
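If the partition number differs on your system, the search step can be wrapped so the pattern is passed in rather than typed into the grep. This is a small sketch; find_device_refs is a made-up name, and it assumes the initrd has already been unpacked into the given directory as shown above.

```shell
# find_device_refs: hypothetical helper; lists the files under DIR whose
# contents match PATTERN (a basic regular expression).
find_device_refs() {
    dir=$1
    pattern=$2
    grep -rl -- "$pattern" "$dir"
}

# Example (not run here): find_device_refs /tmp/t 'sd[a-z]3'
```

Printing only the file names (-l) keeps the output short; drop the "l" to see the matching lines themselves.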
i did find a possible deficiency with the patch i posted... looking more
closely at my yaird /init i see this:
mkbdev '/dev/sdb' 'sdb'
mkbdev '/dev/sdb4' 'sdb/sdb4'
mkbdev '/dev/sda' 'sda'
mkbdev '/dev/sda4' 'sda/sda4'
and i think that means that "mdadm -Ac partitions" will fail if one of my
root disks ends up somewhere other than sda or sdb... because the device
nodes won't exist.
i suspect i should update the patch to use mdrun instead of "mdadm -Ac
partitions"... because mdrun will create temporary device nodes for
everything in /proc/partitions in order to find all the possible raid
pieces.
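To illustrate the idea of creating nodes for everything in /proc/partitions, here is a sketch that turns that file's format into mknod commands. It only prints the commands rather than running them, it is not how mdrun is actually implemented, and partitions_to_mknod is an invented name.

```shell
# partitions_to_mknod: hypothetical helper; reads /proc/partitions-style
# input on stdin and prints (does not run) one mknod command per device.
partitions_to_mknod() {
    # Columns are: major minor #blocks name.
    # The header and the blank line are skipped because their first
    # field is not a number.
    awk '$1 ~ /^[0-9]+$/ { printf "mknod /dev/%s b %s %s\n", $4, $1, $2 }'
}

# Example (not run here): partitions_to_mknod < /proc/partitions
```

With nodes available for every listed partition, assembly by UUID no longer depends on the disks showing up under the names recorded when the initrd was built.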
-dean
* Re: Raid5 & Debian Yaird Woes
2006-02-04 22:07 ` Lewis Shobbrook
2006-02-07 2:52 ` dean gaudet
@ 2006-04-24 15:13 ` Jonas Smedegaard
2006-04-24 15:20 ` Jonas Smedegaard
1 sibling, 1 reply; 11+ messages in thread
From: Jonas Smedegaard @ 2006-04-24 15:13 UTC (permalink / raw)
To: Lewis Shobbrook; +Cc: dean, linux-raid
On Sun, 5 Feb 2006 09:07:29 +1100 Lewis Shobbrook wrote:
> Basically it just states waiting X seconds
Please post in public rather than to me privately.
If this debate is related to a bug already filed against the Debian
package of yaird then cc that bugreport: <bug number>@bugs.debian.org -
and if not then please file a bugreport.
Thanks in advance,
- Jonas
--
* Jonas Smedegaard - idealist og Internet-arkitekt
* Tlf.: +45 40843136 Website: http://dr.jones.dk/
- Enden er nær: http://www.shibumi.org/eoti.htm
* Re: Raid5 & Debian Yaird Woes
2006-04-24 15:13 ` Jonas Smedegaard
@ 2006-04-24 15:20 ` Jonas Smedegaard
0 siblings, 0 replies; 11+ messages in thread
From: Jonas Smedegaard @ 2006-04-24 15:20 UTC (permalink / raw)
To: mylists, dean, linux-raid
On Mon, 24 Apr 2006 17:13:42 +0200 Jonas Smedegaard wrote:
> On Sun, 5 Feb 2006 09:07:29 +1100 Lewis Shobbrook wrote:
>
> > Basically it just states waiting X seconds
>
> Please post in public rather than to me privately.
Uh, how embarrassing: I thought I was looking in my inbox, but instead
was looking in the "todo" box full of old postings I am supposed to
deal with.
Sorry for my rant - I guess I've already commented on this a long time
ago.
Kind regards,
- Jonas
--
* Jonas Smedegaard - idealist og Internet-arkitekt
* Tlf.: +45 40843136 Website: http://dr.jones.dk/
- Enden er nær: http://www.shibumi.org/eoti.htm