Re: let md auto-detect 128+ raid members, fix potential race condition

public inbox for linux-kernel@vger.kernel.org

From: David Greaves <david@dgreaves.com>
To: Alexandre Oliva <aoliva@redhat.com>
Cc: Neil Brown <neilb@suse.de>, Andrew Morton <akpm@osdl.org>,
	linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org
Subject: Re: let md auto-detect 128+ raid members, fix potential race condition
Date: Mon, 31 Jul 2006 22:48:11 +0100
Message-ID: <44CE7A9B.8020508@dgreaves.com>
In-Reply-To: <ord5blcyg0.fsf@free.oliva.athome.lsd.ic.unicamp.br>

Alexandre Oliva wrote:
> On Jul 30, 2006, Neil Brown <neilb@suse.de> wrote:
> 
>>  1/
>>     It just isn't "right".  We don't mount filesystems from partitions
>>     just because they have type 'Linux'.  We don't enable swap on
>>     partitions just because they have type 'Linux swap'.  So why do we
>>     assemble md/raid from partitions that have type 'Linux raid
>>     autodetect'? 
> 
> Similar reason to why vgscan finds and attempts to use any partitions
> that have the appropriate type/signature (difference being that raid
> auto-detect looks at the actual partition type, whereas vgscan looks
> at the actual data, just like mdadm, IIRC): when you have to bootstrap
> from an initrd, you don't want to be forced to have the correct data
> in the initrd image, since then any reconfiguration requires the info
> to be introduced in the initrd image before the machine goes down.
> Sometimes, especially in case of disk failures, you just can't do
> that.
> 
This debate is not about autodetection in general - a good thing (tm) -
but about in-kernel vs userspace autodetection.

Your example supports Neil's case - the proposal is to use the initrd to
run mdadm, which then (kinda) does what vgscan does.


> 
>> So my preferred solution to the problem is to tell people not to use
(in kernel)
>> autodetect.  Quite possibly this should be documented in the code, and
>> maybe even have a KERN_INFO message if more than 64 devices are
>> autodetected. 
> 
> I wouldn't have a problem with that, since then distros would probably
> switch to a more recommended mechanism that works just as well, i.e.,
> ideally without requiring initrd-regeneration after reconfigurations
> such as adding one more raid device to the logical volume group
> containing the root filesystem.

That's supported in today's mdadm.

Look at the --uuid and --name options.
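
As a rough sketch of the --uuid route: the ARRAY line and its UUID below
are made-up placeholders in the style of `mdadm --examine --scan` output,
and array_uuid is a hypothetical helper, not something mdadm provides.
The script only prints the assemble command; it does not run mdadm.

```shell
#!/bin/sh
# Made-up sample line in the style of `mdadm --examine --scan` output.
# The UUID is a placeholder, not taken from any real array.
sample='ARRAY /dev/md0 level=raid1 num-devices=2 UUID=01234567:89abcdef:01234567:89abcdef'

# Hypothetical helper: pull the value of the UUID= field out of an ARRAY line.
array_uuid() {
    printf '%s\n' "$1" | sed -n 's/.*UUID=\([0-9a-fA-F:]*\).*/\1/p'
}

uuid=$(array_uuid "$sample")

# Print, rather than execute, an assemble command that selects members by UUID.
echo mdadm --assemble --scan --config partitions --uuid "$uuid"
```

Because members are selected by what is stored in their superblocks, a
command of this shape would not need editing when devices are added or
renamed.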

>> So:  Do you *really* need to *fix* this, or can you just use 'mdadm'
>> to assemble you arrays instead?
> 
> I'm not sure.  I'd expect not to need it, but the limited feature
> currently in place, that initrd uses to bring up the raid1 devices
> containing the physical volumes that form the volume group where the
> logical volume with my root filesystem is also brings up various raid6
> physical volumes that form an unrelated volume group, and it does so
> in such a way that the last of them, containing the 128th fd-type
> partition in the box, ends up being left out, so the raid device it's
> a member of is brought up either degraded or missing the spare member,
> none of which are good.
> 
> I don't know that I can easily get initrd to replace nash's
> raidautorun for mdadm unless mdadm has a mode to bring up any arrays
> it can find, as opposed to bringing up a specific array out of a given
> list of members or scanning for members.  Either way, this won't fix
> the problem 2) that you mentioned, but requiring initrd-regeneration
> after extending the volume group containing the root device is another
> problem that the current modes of operation of mdadm AFAIK won't
> contemplate, so switching to it will trade one problem for another,
> and the latter is IMHO more common than the former.
> 

I think you should name your raid1 (maybe "hostname-root") and use
initrd to bring it up by --name using:
 mdadm --assemble --scan --config partitions --name hostname-root


It could also, later in the boot process, bring up "hostname-raid6" by
--name too.
 mdadm --assemble --scan --config partitions --name hostname-raid6
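
The two commands above could be wrapped along these lines in a boot
script. This is only a sketch: assemble_by_name, MDADM and DRY_RUN are
hypothetical names of mine, and the array names are the example names
used in this message. With DRY_RUN set, as here, the commands are printed
instead of executed.

```shell
#!/bin/sh
# Sketch only: assemble_by_name, MDADM and DRY_RUN are hypothetical names.
MDADM="${MDADM:-mdadm}"

# Assemble the array whose superblock carries the given name, scanning
# every partition the kernel knows about for members.
assemble_by_name() {
    # ${DRY_RUN:+echo} expands to "echo" when DRY_RUN is set and non-empty,
    # and to nothing otherwise, so the command is then really executed.
    ${DRY_RUN:+echo} "$MDADM" --assemble --scan --config partitions --name "$1"
}

DRY_RUN=1

# In the initrd: only the array holding the root filesystem.
assemble_by_name hostname-root

# Later in the boot process: the unrelated raid6 array.
assemble_by_name hostname-raid6
```

Nothing in such a script names member devices, so it would not need
regenerating when a member is added or replaced.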

David

