From: Neil Brown <neilb@suse.de>
To: "Hawrylewicz Czarnowski,
Przemyslaw" <przemyslaw.hawrylewicz.czarnowski@intel.com>
Cc: "linux-raid@vger.kernel.org" <linux-raid@vger.kernel.org>,
"Neubauer, Wojciech" <Wojciech.Neubauer@intel.com>,
"Williams, Dan J" <dan.j.williams@intel.com>,
"Ciechanowski, Ed" <ed.ciechanowski@intel.com>,
"Labun, Marcin" <Marcin.Labun@intel.com>,
"Czarnowska, Anna" <anna.czarnowska@intel.com>
Subject: Re: Autorebuild, new dynamic udev rules for hot-plugs
Date: Mon, 22 Nov 2010 16:02:01 +1100 [thread overview]
Message-ID: <20101122160201.24ad85f8@notabene.brown> (raw)
In-Reply-To: <66C59AD0932712458090B447266D638C0109E0E1E7@irsmsx504.ger.corp.intel.com>
On Fri, 19 Nov 2010 15:12:19 +0000
"Hawrylewicz Czarnowski, Przemyslaw"
<przemyslaw.hawrylewicz.czarnowski@intel.com> wrote:
> Hi,
>
> I would like to present another patch for our autorebuild tree (it should be applied on the top of Autorebuild series, as devel-3.2 is not stable yet). Dan Williams proposed to reduce overhead associated with passing each hot-plugged block device to mdadm, using just the devices described in policies via mdadm.conf file.
> See patch below for details, comments are as always very welcome.
I am in general in favour of this approach.
It has the benefit that it can very easily be not-used if there turn out to
be problems with it.
Four comments.
1/ I wouldn't write a file in /lib/udev/rules.d/
I think it should be written to "/dev/.udev/rules.d/"
which is referred to as the "temporary rules directory"
in the udev documentation.
2/ I would be good to process the type=disk or type=part part of the
policy into the rules file as well.
3/ I'm not very comfortable with hard-coding the name of the
file to be created in the rules.d directory. Maybe usage could be
--activate-domains=63-md-whatever
4/ I don't think it is good to have an incomplete file in rules.d that udev
might accidentally read. We should create the file with a name with a
leading '.' (assuming udev ignores those, I haven't checked) and then
rename it after it has been completely written.
Other than that, it looks pretty good.
Thanks,
NeilBrown
>
> Date: Thu, 18 Nov 2010 01:19:52 +0100
> Subject: [PATCH] New dynamic hot-plug udev rules for policies
>
> When introducing policies, new hot-plug rules were added to support
> bare disks. Mdadm was started for each hot plugged block device
> to determine if it could be used as spare or as a replacement member for
> degraded array.
> This patch introduces limitation of range of devices that are handled
> by mdadm. It limits them to the ones specified in domains associated
> with the actions: spare-same-port, spare and spare-force.
> In order to enable hot-plug for bare disks one must update udev rules
> with command
>
> mdadm --activate-domains
>
> After mdadm.conf is changed one is obliged to re-run
> "mdadm --activate-domains" command in order to bring the system
> configuration up to date.
> All hot-plugged disks containing metadata are still handled by existing
> rules.
>
> Note: this patch is just a proposition to minimize overhead of using mdadm for
> each plugged block device. If accepted, it will be incorporated in
> previous implementation of hot-plug for bare disks.
>
> Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com>
> ---
> Makefile | 2 +
> ReadMe.c | 1 +
> mdadm.c | 4 ++
> mdadm.h | 9 +++-
> policy.c | 137 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> udev-md-raid.rules | 5 +-
> 6 files changed, 154 insertions(+), 4 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index 2b88818..eeae8e1 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -73,9 +73,11 @@ MAP_FILE = map
> MDMON_DIR = /dev/.mdadm
> # place for autoreplace cookies
> FAILED_SLOTS_DIR = /dev/.mdadm/failed-slots
> +UDEV_RULES_DIR = $(DESTDIR)/lib/udev/rules.d
> DIRFLAGS = -DMAP_DIR=\"$(MAP_DIR)\" -DMAP_FILE=\"$(MAP_FILE)\"
> DIRFLAGS += -DMDMON_DIR=\"$(MDMON_DIR)\"
> DIRFLAGS += -DFAILED_SLOTS_DIR=\"$(FAILED_SLOTS_DIR)\"
> +DIRFLAGS += -DUDEV_RULES_DIR=\"$(UDEV_RULES_DIR)\"
> CFLAGS = $(CWFLAGS) $(CXFLAGS) -DSendmail=\""$(MAILCMD)"\" $(CONFFILEFLAGS) $(DIRFLAGS)
>
> # The glibc TLS ABI requires applications that call clone(2) to set up
> diff --git a/ReadMe.c b/ReadMe.c
> index 54a1998..f17b8a1 100644
> --- a/ReadMe.c
> +++ b/ReadMe.c
> @@ -110,6 +110,7 @@ struct option long_options[] = {
> {"detail-platform", 0, 0, DetailPlatform},
> {"kill-subarray", 1, 0, KillSubarray},
> {"update-subarray", 1, 0, UpdateSubarray},
> + {"activate-domains", 0, 0, ActivateDomains},
>
> /* synonyms */
> {"monitor", 0, 0, 'F'},
> diff --git a/mdadm.c b/mdadm.c
> index c9a172a..b5403cf 100644
> --- a/mdadm.c
> +++ b/mdadm.c
> @@ -228,6 +228,7 @@ int main(int argc, char *argv[])
> }
> subarray = optarg;
> }
> + case ActivateDomains:
> case 'K': if (!mode) newmode = MISC; break;
> case NoSharing: newmode = MONITOR; break;
> }
> @@ -841,6 +842,7 @@ int main(int argc, char *argv[])
> case O(MISC, DetailPlatform):
> case O(MISC, KillSubarray):
> case O(MISC, UpdateSubarray):
> + case O(MISC, ActivateDomains):
> if (devmode && devmode != opt &&
> (devmode == 'E' || (opt == 'E' && devmode != 'Q'))) {
> fprintf(stderr, Name ": --examine/-E cannot be given with ");
> @@ -1421,6 +1423,8 @@ int main(int argc, char *argv[])
> free_mdstat(ms);
> } while (!last && err);
> if (err) rv |= 1;
> + } else if (devmode == ActivateDomains) {
> + rv = Activate_Domains();
> } else {
> fprintf(stderr, Name ": No devices given.\n");
> exit(2);
> diff --git a/mdadm.h b/mdadm.h
> index 9b4a1a8..171aa69 100644
> --- a/mdadm.h
> +++ b/mdadm.h
> @@ -101,6 +101,11 @@ extern __off64_t lseek64 __P ((int __fd, __off64_t __offset, int __whence));
> #define FAILED_SLOTS_DIR "/dev/.mdadm/failed-slots"
> #endif /* FAILED_SLOTS */
>
> +/* UDEV_RULES_DIR is the place, where udev holds its rules */
> +#ifndef UDEV_RULES_DIR
> +#define UDEV_RULES_DIR "/lib/udev/rules.d"
> +#endif /* UDEV_RULES_DIR */
> +
> #include "md_u.h"
> #include "md_p.h"
> #include "bitmap.h"
> @@ -289,7 +294,8 @@ enum special_options {
> KillSubarray,
> UpdateSubarray, /* 16 */
> IncrementalPath,
> - NoSharing
> + NoSharing,
> + ActivateDomains
> };
>
> /* structures read from config file */
> @@ -971,6 +977,7 @@ extern int CreateBitmap(char *filename, int force, char uuid[16],
> unsigned long long array_size,
> int major);
> extern int ExamineBitmap(char *filename, int brief, struct supertype *st);
> +extern int Activate_Domains(void);
> extern int bitmap_update_uuid(int fd, int *uuid, int swap);
> extern unsigned long bitmap_sectors(struct bitmap_super_s *bsb);
>
> diff --git a/policy.c b/policy.c
> index daee52a..e4c53d5 100644
> --- a/policy.c
> +++ b/policy.c
> @@ -753,3 +753,140 @@ void domain_free(struct domainlist *dl)
> free(head);
> }
> }
> +
> +/* invocation of udev rule file */
> +char udev_template_start[] =
> +"# do not edit this file, it will be overwritten on update\n"
> +"\n"
> +"SUBSYSTEM!=\"block\", GOTO=\"md_autorebuild_end\"\n"
> +"\n"
> +"ENV{ID_FS_TYPE}==\"linux_raid_member\", GOTO=\"md_autorebuild_end\"\n"
> +"ENV{ID_FS_TYPE}==\"isw_raid_member\", GOTO=\"md_autorebuild_end\"\n"
> +"\n";
> +
> +/* ending of udev rule file */
> +char udev_template_end[] =
> +"\n"
> +"LABEL=\"md_autorebuild_end\"\n"
> +"\n";
> +
> +/* find rule named rule_type and return its value */
> +char *find_rule(struct rule *rule, char *rule_type)
> +{
> + while (rule) {
> + if (rule->name == rule_type)
> + return rule->value;
> +
> + rule = rule->next;
> + }
> + return NULL;
> +}
> +
> +#define UDEV_RULE_FORMAT \
> +"ACTION==\"add\", KERNEL!=\"md*\" ENV{ID_PATH}==\"%s\" " \
> +"RUN+=\"/sbin/mdadm --incremental $env{DEVNAME}\", " \
> +"GOTO=\"md_autorebuild_end\"\n" \
> +
> +/* Write rule in the rule file. Use format from UDEV_RULE_FORMAT */
> +int write_rule(struct rule *rule, int fd)
> +{
> + char line[1024];
> + char *r = find_rule(rule, rule_path);
> + if (!r)
> + return -1;
> +
> + snprintf(line, sizeof(line) - 1, UDEV_RULE_FORMAT, r);
> + return write(fd, line, strlen(line));
> +}
> +
> +/* Generate single entry in udev rule basing on POLICY line found in config
> + * file. Take only those with paths, only first occurrence if paths are equal
> + * and if actions supports handling of spares (>=act_spare_same_slot)
> + */
> +int generate_entries(int fd)
> +{
> + struct pol_rule *loop, *dup;
> + char *loop_value, *dup_value;
> + int duplicate;
> + int written = 0;
> +
> + for (loop = config_rules; loop; loop = loop->next) {
> + if (loop->type != rule_policy)
> + continue;
> + duplicate = 0;
> +
> + /* only policies with paths and with actions supporting
> + * bare disks are considered */
> + loop_value = find_rule(loop->rule, pol_act);
> + if (!loop_value || map_act(loop_value) < act_spare_same_slot)
> + continue;
> + loop_value = find_rule(loop->rule, rule_path);
> + if (!loop_value)
> + continue;
> + for (dup = config_rules; dup != loop; dup = dup->next) {
> + if (dup->type != rule_policy)
> + continue;
> + dup_value = find_rule(dup->rule, pol_act);
> + if (!dup_value || map_act(dup_value) < act_spare_same_slot)
> + continue;
> + dup_value = find_rule(dup->rule, rule_path);
> + if (!dup_value)
> + continue;
> + if (strcmp(loop_value, dup_value) == 0) {
> + duplicate = 1;
> + break;
> + }
> + }
> +
> + /* not a dup or first occurrence */
> + if (!duplicate) {
> + if (write_rule(loop->rule, fd) == -1)
> + return 0;
> + written++;
> + }
> + }
> + return written;
> +}
> +
> +#define AR_UDEV_RULE_FILE UDEV_RULES_DIR "/63-md-raid-autorebuild.rules"
> +
> +/* Activate_Domains routine creates dynamic udev rules used to handle
> + * hot-plug events for bare devices (and making them spares)
> + */
> +int Activate_Domains(void)
> +{
> + int fd = 0;
> + int rv;
> +
> + fd = creat(AR_UDEV_RULE_FILE,
> + S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
> + if (fd == -1)
> + return 1;
> +
> + /* write static invocation */
> + if (write(fd, udev_template_start,
> + sizeof(udev_template_start) - 1) == -1) {
> + close(fd);
> + unlink(AR_UDEV_RULE_FILE);
> + return 1;
> + }
> +
> + /* iterate, if none created or error occurred, remove file */
> + rv = generate_entries(fd);
> + if (rv <= 0) {
> + close(fd);
> + unlink(AR_UDEV_RULE_FILE);
> + return rv == -1 ? -1 : 0;
> + }
> +
> + /* write ending */
> + if (write(fd, udev_template_end, sizeof(udev_template_end) - 1) == -1) {
> + close(fd);
> + unlink(AR_UDEV_RULE_FILE);
> + return 1;
> + }
> +
> + close(fd);
> +
> + return 0;
> +}
> diff --git a/udev-md-raid.rules b/udev-md-raid.rules
> index 36dd51e..11057bb 100644
> --- a/udev-md-raid.rules
> +++ b/udev-md-raid.rules
> @@ -3,11 +3,10 @@
> SUBSYSTEM!="block", GOTO="md_end"
>
> # handle potential components of arrays
> +ENV{ID_FS_TYPE}=="linux_raid_member", ACTION=="add", RUN+="/sbin/mdadm --incremental $env{DEVNAME}"
> ENV{ID_FS_TYPE}=="linux_raid_member", ACTION=="remove", RUN+="/sbin/mdadm -If $name --path $env{ID_PATH}"
> +ENV{ID_FS_TYPE}=="isw_raid_member", ACTION=="add", RUN+="/sbin/mdadm --incremental $env{DEVNAME}"
> ENV{ID_FS_TYPE}=="isw_raid_member", ACTION=="remove", RUN+="/sbin/mdadm -If $name --path $env{ID_PATH}"
> -# try incremental for each block device and not md. Most cases should not create
> -# unnecessary overhead, as no "heavy" disk operations are performed
> -ACTION=="add", KERNEL!="md*", RUN+="/sbin/mdadm --incremental $env{DEVNAME}"
>
> # handle md arrays
> ACTION!="add|change", GOTO="md_end"
next prev parent reply other threads:[~2010-11-22 5:02 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-10-29 14:13 [Patch 00/17] Autorebuild Czarnowska, Anna
2010-11-17 10:22 ` Neil Brown
2010-11-17 16:04 ` Labun, Marcin
2010-11-18 23:14 ` Czarnowska, Anna
2010-11-19 12:43 ` Devel 3.2 branch issues Czarnowska, Anna
2010-11-22 3:29 ` Neil Brown
2010-11-22 17:18 ` Labun, Marcin
2010-11-22 18:47 ` Dan Williams
2010-11-23 17:34 ` Labun, Marcin
2010-11-19 15:12 ` Autorebuild, new dynamic udev rules for hot-plugs Hawrylewicz Czarnowski, Przemyslaw
2010-11-22 5:02 ` Neil Brown [this message]
2010-11-22 23:50 ` Hawrylewicz Czarnowski, Przemyslaw
2010-11-23 0:11 ` Dan Williams
2010-11-23 1:17 ` Neil Brown
2010-11-23 5:04 ` Dan Williams
2010-11-23 5:27 ` Neil Brown
2010-11-23 6:17 ` Dan Williams
2010-11-23 17:01 ` Hawrylewicz Czarnowski, Przemyslaw
2010-12-23 15:44 ` Hawrylewicz Czarnowski, Przemyslaw
2010-11-22 2:16 ` [Patch 00/17] Autorebuild Neil Brown
2010-11-22 15:08 ` Czarnowska, Anna
2010-11-23 1:34 ` Neil Brown
2010-11-23 18:20 ` Labun, Marcin
2010-12-09 11:40 ` Czarnowska, Anna
2010-12-13 0:21 ` Neil Brown
2010-12-14 14:47 ` [PATCH] fix: Monitor doesn't return after starting daemon Czarnowska, Anna
2010-12-14 21:58 ` Neil Brown
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20101122160201.24ad85f8@notabene.brown \
--to=neilb@suse.de \
--cc=Marcin.Labun@intel.com \
--cc=Wojciech.Neubauer@intel.com \
--cc=anna.czarnowska@intel.com \
--cc=dan.j.williams@intel.com \
--cc=ed.ciechanowski@intel.com \
--cc=linux-raid@vger.kernel.org \
--cc=przemyslaw.hawrylewicz.czarnowski@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).