From: Dan Williams <dan.j.williams@intel.com>
To: Doug Ledford <dledford@redhat.com>
Cc: Linux RAID Mailing List <linux-raid@vger.kernel.org>,
	Neil Brown <neilb@suse.de>,
	"Labun, Marcin" <Marcin.Labun@intel.com>
Subject: Re: More Hot Unplug/Plug work
Date: Thu, 29 Apr 2010 14:22:12 -0700
Message-ID: <r2ve9c3a7c21004291422q2144eaffsd16c8fe3c5ff8784@mail.gmail.com>
In-Reply-To: <4BD714A3.9020801@redhat.com>

On Tue, Apr 27, 2010 at 9:45 AM, Doug Ledford <dledford@redhat.com> wrote:
> So I pulled down Neil's git repo and started working from his hotunplug
> branch, which was his version of my hotunplug patch.  I had to do a
> couple minor fixes to it to make it work.  I then simply continued on
> from there.  I have a branch in my git repo that tracks his hotunplug
> branch and is also called hotunplug.  That's where my current work is at.
>
> What I've done since then:
>
> 1) I've implemented a new config file line type: DOMAIN
>   a) Each DOMAIN line must have at least one valid path= entry, but may
>      have more than one path= entry.  path= entries are file globs and
>      must match something in /dev/disk/by-path
>   b) Each DOMAIN line must have one and only one action= entry.  Valid
>      action items are: ignore, incremental, spare, grow, partition.
>      In addition, a word may be prefixed with force- to indicate that
>      we should skip certain safety checks and use the device even if it
>      isn't clean.

Just to confirm that we are on the same page with these actions:
* incremental: the default action, which "does the right thing" if the
drive already has metadata.  I assume we need checks here to reject
disks with ambiguous metadata (multiple valid metadata records).
* spare: implies incremental, but if it is a 'bare' device, write a
spare record.
* grow: implies incremental, but if it is a 'bare' device, write a
spare record; then, if there is a degraded array in the domain,
rebuild it, otherwise grow an(y?) array in the domain.
* partition: if the device has a partition table that matches the
specified table, then add the partitions incrementally.
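
For illustration only, the reading of the actions above can be sketched
as a small decision function.  This is Python for brevity, not mdadm
code; the names Device, plan, metadata_records and
partition_table_matches are made up for the sketch, and the force-
prefix is left out.  Cases the list above does not cover are reported
as "unspecified" or marked as an assumption in the comments.

```python
from dataclasses import dataclass

# Action names taken from the quoted DOMAIN description.
ACTIONS = {"ignore", "incremental", "spare", "grow", "partition"}


@dataclass
class Device:
    # Hypothetical summary of what was found on a newly inserted device.
    metadata_records: int = 0              # number of valid metadata records
    partition_table_matches: bool = False  # table matches the DOMAIN's table=


def plan(action, dev, degraded_array_in_domain=False):
    """Return a short description of what this reading of the action would do."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    if action == "ignore":
        return "ignore device"
    if action == "partition":
        if dev.partition_table_matches:
            return "add partitions incrementally"
        # The non-matching case is not spelled out in the list above.
        return "unspecified"

    # incremental, spare and grow all share the incremental behaviour.
    if dev.metadata_records > 1:
        return "reject: ambiguous metadata"
    if dev.metadata_records == 1:
        return "incremental add"

    # From here on the device is 'bare' (no metadata at all).
    if action == "spare":
        return "write spare record"
    if action == "grow":
        if degraded_array_in_domain:
            return "write spare record, then rebuild degraded array"
        return "write spare record, then grow an array"
    # Assumption: plain incremental leaves a bare device alone.
    return "no action"


print(plan("grow", Device(), degraded_array_in_domain=True))
```

Which array gets grown in the last grow case is exactly the "an(y?)"
question above, so the sketch does not try to pick one.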

A few comments:
1/ Does 'partition' need to be split to 'partition-spare' and
'partition-grow' to imply the action post partitioning?
2/ One of the safety checks for hot-inserting a spare is that it
occurs on a port that was recently unplugged.  Should that be the
default policy, or do we need a different flavor of spare action,
like 'spare-same-port'?

>   c) Each DOMAIN line may have a metadata entry, and may have a
>      spare-group entry.

What is the purpose of the spare group?  I thought we were assuming
that all DOMAIN members were automatically in the same spare group.
Is this to augment the policy to allow spares to float between
DOMAINs?  Something like the following where the different domains
allow spares to cross boundaries?
DOMAIN path=A spare-group=B action=grow
DOMAIN path=B spare-group=A action=spare
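
As a rough sketch of how lines like the two above could be checked
against the quoted rules (at least one path=, exactly one action=, an
optional force- prefix, optional metadata and spare-group entries), the
following Python parses a single DOMAIN line.  It is not mdadm's config
parser: the function name and the returned dictionary layout are
invented for illustration, the "metadata=" and "spare-group=" spellings
are taken from the examples rather than from a spec, and it does not
check that path globs match anything in /dev/disk/by-path.

```python
VALID_ACTIONS = {"ignore", "incremental", "spare", "grow", "partition"}


def parse_domain_line(line):
    """Parse one DOMAIN line into a dict, raising ValueError if it is invalid."""
    words = line.split()
    if not words or words[0] != "DOMAIN":
        raise ValueError("not a DOMAIN line")

    domain = {"paths": [], "action": None, "force": False,
              "metadata": None, "spare_group": None, "other": {}}

    for word in words[1:]:
        key, sep, value = word.partition("=")
        if not sep or not value:
            raise ValueError(f"malformed entry: {word!r}")
        if key == "path":
            domain["paths"].append(value)          # one or more globs allowed
        elif key == "action":
            if domain["action"] is not None:
                raise ValueError("more than one action= entry")
            force = value.startswith("force-")
            name = value[len("force-"):] if force else value
            if name not in VALID_ACTIONS:
                raise ValueError(f"unknown action: {value!r}")
            domain["action"], domain["force"] = name, force
        elif key == "metadata":
            domain["metadata"] = value
        elif key == "spare-group":
            domain["spare_group"] = value
        else:
            # Keep anything else (e.g. program=, table=) without interpreting it.
            domain["other"][key] = value

    if not domain["paths"]:
        raise ValueError("DOMAIN line needs at least one path= entry")
    if domain["action"] is None:
        raise ValueError("DOMAIN line needs exactly one action= entry")
    return domain


for text in ("DOMAIN path=A spare-group=B action=grow",
             "DOMAIN path=B spare-group=A action=spare"):
    print(parse_domain_line(text))
```

The sketch only records the spare-group value; whether that value names
another domain that spares may float to is the open question above.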

>   d) For the partition action, a DOMAIN line must have a program= and
>      a table= entry.  Currently, the program= entry must be an item
>      out of a list of known partition programs (I'm working on getting
>      sfdisk up and running, but for arches other than x86, other
>      methods would be needed, and I'm planning on adding a method
>      that allows us to call out to a user supplied script/program
>      instead of a known internal method).  The table= entry points to
>      a file that contains a method specific table indicating the
>      necessary partition layout.  As mentioned in previous mails, we
>      only support identical partition tables at this point.  That
>      may never change, who knows.
>
> 2) Created a new udev rules file that gets installed as
> 05-md-early.rules.  This rule file, combined with our existing rules
> file, is a key element to how this domain support works.  In particular,
> udev rules allow us to separate out devices that already have some sort
> of raid superblock from devices that don't.  We then add a new flag to
> our incremental mode to indicate that a device currently does not belong
> to us, and we perform a series of checks to see if it should, and if so,
> we "grab" it (I would have preferred a better name, but the short
> options for better names were already taken).  When called with the
> "grab" flag, we follow a different code path where we check the domain
> of the device against our DOMAIN entries and if we have a match, we
> perform the specified action.  There will need to be some additional
> work to catch certain corner cases, such as the case where we have
> force-partition and we insert a disk that currently has a raid
> superblock on the bare drive.  We will currently miss that situation and
> not grab the device.  So, this is a work in progress and not yet complete.
>

I notice this rules file grabs all events.  Did you see, or do you
disagree with, the suggestion to have an mdadm --activate-domains
command to generate udev rules for only the paths we care about?
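
Whichever side does the filtering, the per-event decision comes down to
matching the device's by-path name against the DOMAIN path globs.  A
minimal Python sketch of that match follows.  It assumes the globs are
compared against the link name inside /dev/disk/by-path, which the
quoted description does not state explicitly; the function name, the
(globs, action) tuples and the sample PCI path are illustrative only.

```python
import fnmatch
import os


def matching_domain(by_path_name, domains):
    """Return the first (globs, action) pair with a glob matching the device.

    by_path_name is a link under /dev/disk/by-path; only its final
    component is compared against the globs (an assumption of this sketch).
    """
    name = os.path.basename(by_path_name)
    for globs, action in domains:
        if any(fnmatch.fnmatchcase(name, glob) for glob in globs):
            return (globs, action)
    return None


# Hypothetical domains: one covering a single controller, one ignoring USB.
domains = [
    (["pci-0000:00:1f.2-*"], "spare"),
    (["*-usb-*"], "ignore"),
]

hit = matching_domain("/dev/disk/by-path/pci-0000:00:1f.2-scsi-0:0:0:0", domains)
print(hit[1] if hit else "no domain, leave the device alone")
```

Events for devices that match no domain would simply never need to
reach the "grab" path.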

--
Dan
