From: NeilBrown <neilb@suse.de>
To: Hans-Peter Jansen <hpj@urpla.net>
Cc: Linux RAID <linux-raid@vger.kernel.org>
Subject: Re: Persistent failures with simple md setup
Date: Mon, 4 Mar 2013 10:33:57 +1100 [thread overview]
Message-ID: <20130304103357.45b2cad2@notabene.brown> (raw)
In-Reply-To: <4291349.FrQcKOnicQ@xrated>
[-- Attachment #1: Type: text/plain, Size: 4939 bytes --]
On Sun, 03 Mar 2013 20:31:57 +0100 Hans-Peter Jansen <hpj@urpla.net> wrote:
> Am Donnerstag, 28. Februar 2013, 23:16:01 schrieb Hans-Peter Jansen:
> > Am Freitag, 1. März 2013, 08:25:20 schrieb NeilBrown:
> > > On Thu, 28 Feb 2013 11:49:53 +0100 Hans-Peter Jansen <hpj@urpla.net>
> > > wrote:
> > >
> > > This is what I asked for:
> > > > > It really feels like a udev bug, but it is hard to be sure.
> > > > > Could you edit /etc/init.d/boot.md and add
> > > > >
> > > > > echo BEFORE mdadm -IRs > /dev/kmsg
> > > > >
> > > > > just before the call the "$mdadm_BIN -IRs",
> > > > >
> > > > > echo AFTER mdadm -IRs > /dev/kmsg
> > > > >
> > > > > just after that call, and
> > > > >
> > > > > echo AFTER mdadm -Asc > /dev/mksg
> > > > >
> > > > > just after the call to "$mdadm_BIN -A -s -c $mdadm_CONFIG"
> > > > >
> > > > > And maybe put similar messages just before and after the
> > > > >
> > > > > /sbin/udevadm settle ....
> > > > >
> > > > > call.
> > > > >
> > > > > If you can then reproduce the problem, the extra logging might tell us
> > > > > something useful.
> > >
> > > This is not at all the same thing:
> > > > I've enabled initrd (linuxrc=trace) and boot.md (#!/bin/sh -x)
> > > > debugging,
> > > > hence we should see the full story next time.
> > >
> > > It might help, but I'm far from certain that it will be nearly as useful.
>
> I see, sh debug messages do not condense in /var/log/something, they're routed
> somewhere else. I not sure, how I like that systemd implied sillyness.
>
> > Okay, okay, I do it both ways now..
> >
> > Let's see, when it triggers.
>
> I'm not sure, how the basic kernel boot contribute to boiling this issue down,
> but here we go: (not censored this time, hope it will go though..)
Thanks. The interesting bit is here:
> Mar 3 09:11:59 zaphkiel kernel: [ 28.781703] md: bind<sdb4>
It seems that udev is running "mdadm -I" on sdb4 here
> Mar 3 09:11:59 zaphkiel kernel: [ 28.799939] tveeprom 1-0050: audio processor is MSP3415 (idx 6)
> Mar 3 09:11:59 zaphkiel kernel: [ 28.862699] sr 7:0:0:0: Attached scsi generic sg6 type 5
> Mar 3 09:11:59 zaphkiel kernel: [ 28.894667] tveeprom 1-0050: has radio
> Mar 3 09:11:59 zaphkiel kernel: [ 28.917194] bttv0: Hauppauge eeprom indicates model#44354
> Mar 3 09:11:59 zaphkiel kernel: [ 28.984037] bttv0: tuner type=20
> Mar 3 09:11:59 zaphkiel kernel: [ 29.003670] md: bind<sdb1>
and "mdadm -I sdb1" here
> Mar 3 09:11:59 zaphkiel kernel: [ 29.026123] input: HDA Digital PCBeep as /devices/pci0000:00/0000:00:07.0/input/input5
> Mar 3 09:11:59 zaphkiel kernel: [ 29.098905] i2c-core: driver [msp3400] using legacy suspend method
> Mar 3 09:11:59 zaphkiel kernel: [ 29.139227] AFTER udevadm settle --timeout=60
and now udev is "settled". This is much less than 60 seconds after the
"BEFORE udevadm settle", so it didn't time out - it must have exhausted the
queue.
> Mar 3 09:11:59 zaphkiel kernel: [ 29.211092] i2c-core: driver [msp3400] using legacy resume method
> Mar 3 09:11:59 zaphkiel kernel: [ 29.276147] msp3400 1-0040: MSP3415D-B3 found @ 0x80 (bt878 #0 [sw])
> Mar 3 09:11:59 zaphkiel kernel: [ 29.325203] md: bind<sdb2>
But look - it seems that "mdadm -I /dev/sdb2" was just run - presumably by
udev.
> Mar 3 09:11:59 zaphkiel kernel: [ 29.391360] msp3400 1-0040: msp3400 supports nicam, BEFORE mdadm -IRs
We managed to get two lines blended together here, but it is clear that this
is just before "mdadm -IRs" runs.
> Mar 3 09:11:59 zaphkiel kernel: [ 29.455183] mode is autodetect
> Mar 3 09:11:59 zaphkiel kernel: [ 29.525553] md/raid1:md0: active with 1 out of 2 mirrors
>
And unsurprisingly, md0 is assembled with only on mirror because we have seen
sdb1 but not sda1
> Mar 3 09:11:59 zaphkiel kernel: [ 29.572395] i2c-core: driver [tuner] using legacy suspend method
> Mar 3 09:11:59 zaphkiel kernel: [ 29.642012] i2c-core: driver [tuner] using legacy resume method
> Mar 3 09:11:59 zaphkiel kernel: [ 29.688786] md: bind<sda4>
and here comes sda4. Presumably sda1 is attempted a little later, but as md0
is already assembled, mdadm refuses to do anything with it.
So "mdadm -IRs" is being run too early, apparently because "udevadm settle" is
exiting too early. However I have looked closely at the udev code, and run
various tests, and cannot see how it could possibly exit early. I can
imagine it exiting a bit late in some anomalous situations, but not early.
The only way I can explain this is to suggest that something other than udev
is running "mdadm -I". But I cannot imagine what would be doing that.
If you put a "sleep 10" before the "mdadm -IRs", I suspect your problems
would go away. It isn't really a nice solution, but it is the only one I can
think of at the moment.
NeilBrown
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
next prev parent reply other threads:[~2013-03-03 23:33 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-01-29 22:14 Persistent failures with simple md setup Hans-Peter Jansen
2013-01-30 9:07 ` Sebastian Riemer
2013-01-30 17:12 ` Hans-Peter Jansen
2013-02-04 20:43 ` Hans-Peter Jansen
2013-02-05 3:44 ` NeilBrown
2013-02-27 17:01 ` Hans-Peter Jansen
2013-02-28 3:40 ` NeilBrown
2013-02-28 10:49 ` Hans-Peter Jansen
2013-02-28 21:25 ` NeilBrown
2013-02-28 22:16 ` Hans-Peter Jansen
[not found] ` <4291349.FrQcKOnicQ@xrated>
2013-03-03 23:33 ` NeilBrown [this message]
2013-03-13 0:52 ` NeilBrown
2013-03-15 22:43 ` Hans-Peter Jansen
2013-03-18 11:20 ` Hans-Peter Jansen
2013-03-21 3:24 ` NeilBrown
2013-04-10 13:28 ` Hans-Peter Jansen
2013-04-10 13:44 ` Hans-Peter Jansen
2013-04-11 7:33 ` NeilBrown
2013-01-30 9:20 ` Roy Sigurd Karlsbakk
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130304103357.45b2cad2@notabene.brown \
--to=neilb@suse.de \
--cc=hpj@urpla.net \
--cc=linux-raid@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).