From: NeilBrown <neilb@suse.de>
To: "Baldysiak, Pawel" <pawel.baldysiak@intel.com>
Cc: "linux-raid@vger.kernel.org" <linux-raid@vger.kernel.org>,
"Paszkiewicz, Artur" <artur.paszkiewicz@intel.com>
Subject: Re: IMSM - problem with reshape+systemd
Date: Thu, 24 Apr 2014 14:31:38 +1000
Message-ID: <20140424143138.222bd1f8@notabene.brown>
In-Reply-To: <84A53BEA6EAC69439B7E311E9B17A76F078F1EE0@IRSMSX105.ger.corp.intel.com>
On Fri, 18 Apr 2014 09:24:35 +0000 "Baldysiak, Pawel"
<pawel.baldysiak@intel.com> wrote:
> Hi Neil/All.
> We have discovered some problems with IMSM array reshape under OSs managed by systemd.
> When reshaping arrays with IMSM metadata, mdadm manages the whole reshape process and needs to keep running in the background.
> If we reboot while reshaping, the array will be assembled at startup
> by a udev worker, via the IMPORT{program}="mdadm -I /dev/sdX --export --offroot" part of the udev rule.
> mdadm will fork and continue to reshape the array from the checkpoint.
> However, systemd will treat the udev worker as a hung process and kill it, together with all its children, when the timeout expires (the reshape then hangs).
> I had planned to propose a patch for this problem, in which an additional unit file is added
> and udev starts a systemd service for the mdadm -I command (see below),
> but then we lose the exported variables, the ones that are used to trigger the mdadm-last-resort service.
>
> Do you have any idea how to solve this problem and keep both functionalities?
Hi,
thanks for raising this issue.
I think we need to address this using "mdadm --grow --continue".
e.g. in udev we run "mdadm -I --freeze-reshape --export" and arrange for
that to report some setting if a reshape is needed.
If it is needed, we set SYSTEMD_WANTS to some service which will run "mdadm
--grow --continue $device".
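A rough, untested sketch of how those two pieces could fit together. The
variable MD_RESHAPE_NEEDED is a made-up name standing in for "some setting"
(mdadm does not export it today), the unit name mdadm-reshape@.service is
only a placeholder, and MD_DEVICE is assumed to hold the array's kernel name
(e.g. md127), as the existing mdadm-last-resort rule already assumes.

```
# udev-md-raid-assembly.rules (sketch)
# Assemble incrementally but leave any reshape frozen; keep --export so the
# MD_* variables are still imported into the udev environment.
ACTION=="add|change", IMPORT{program}="/sbin/mdadm --incremental --freeze-reshape --export $devnode --offroot ${DEVLINKS}"
# MD_RESHAPE_NEEDED is hypothetical: it assumes mdadm is taught to report it.
ACTION=="add|change", ENV{MD_RESHAPE_NEEDED}=="yes", ENV{SYSTEMD_WANTS}+="mdadm-reshape@$env{MD_DEVICE}.service"
```

```
# systemd/mdadm-reshape@.service (sketch, placeholder name)
[Unit]
Description=Continue reshape of MD array %I
DefaultDependencies=no

[Service]
# Runs under systemd rather than as a child of the udev worker, so the
# worker timeout no longer applies to the reshape.
ExecStart=/sbin/mdadm --grow --continue /dev/%I
```

Because the import still goes through IMPORT{program}, the exported variables
that trigger mdadm-last-resort would remain available.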
Possibly we could get mdadm to run "systemctl start mdadm-reshape@$dev"
instead of forking, like it now does for running mdmon.
I might have a poke at the code and see what falls out.
NeilBrown
>
> Pawel Baldysiak
>
> --------------------------------------------------------------------------------------------------------------
> My patch ("IMPORT{program}" behaves the same as "RUN", but exports the output as variables):
>
> From 8549f0ffcd72589cedf24d07b496af2ce16d14ec Mon Sep 17 00:00:00 2001
> From: Pawel Baldysiak <pawel.baldysiak@intel.com>
> Date: Thu, 10 Apr 2014 15:16:02 +0200
> Subject: [PATCH] Use unit file for incremental assemblation from udev.
>
> Incremental assemblation of an array at OS boot is started by RUN
> command triggered by udev, so far. RUN command is used for starting
> short-time processes that will complete quickly. Some operations, like
> reshape of IMSM arrays, are managed by mdadm. In OSs managed by systemd -
> udev worker that triggered "mdadm -I" will be terminated by SIGKILL due
> to timeout. This also kills mdadm process, so reshape will stop.
>
> This patch adds new unit file, that will be started in OSs managed by
> systemd instead of "RUN=" command. Udev rule will only start the new
> service and finish its work. Unit file will start "mdadm -I" for disk
> passed as an argument from rule.
>
> In scenario where we reshape IMSM array, general migration record is
> written only on two first disks of an array, so if we reboot OS and udev
> starts adding disks from e.g. the last one, "mdadm -I" will end with
> exit code "4" due to inaccessible general migration record. This should
> also be considered as success exit status, because disk is successfully
> assembled according to its metadata. Otherwise system will log
> information about service failure.
>
> Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
> Reviewed-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
> ---
> Makefile | 1 +
> systemd/mdadm-inc@.service | 10 ++++++++++
> udev-md-raid-assembly.rules | 4 +++-
> 3 files changed, 14 insertions(+), 1 deletion(-)
> create mode 100644 systemd/mdadm-inc@.service
>
> diff --git a/Makefile b/Makefile
> index b823d85..b199efd 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -288,6 +288,7 @@ install-systemd: systemd/mdmon@.service
> $(INSTALL) -D -m 644 systemd/mdmonitor.service $(DESTDIR)$(SYSTEMD_DIR)/mdmonitor.service
> $(INSTALL) -D -m 644 systemd/mdadm-last-resort@.timer $(DESTDIR)$(SYSTEMD_DIR)/mdadm-last-resort@.timer
> $(INSTALL) -D -m 644 systemd/mdadm-last-resort@.service $(DESTDIR)$(SYSTEMD_DIR)/mdadm-last-resort@.service
> + $(INSTALL) -D -m 644 systemd/mdadm-inc@.service $(DESTDIR)$(SYSTEMD_DIR)/mdadm-inc@.service
> $(INSTALL) -D -m 755 systemd/mdadm.shutdown $(DESTDIR)$(SYSTEMD_DIR)-shutdown/mdadm.shutdown
> if [ -f /etc/SuSE-release -o -n "$(SUSE)" ] ;then $(INSTALL) -D -m 755 systemd/SUSE-mdadm_env.sh $(DESTDIR)$(SYSTEMD_DIR)/../scripts/mdadm_env.sh ;fi
> diff --git a/systemd/mdadm-inc@.service b/systemd/mdadm-inc@.service
> new file mode 100644
> index 0000000..b7a97a3
> --- /dev/null
> +++ b/systemd/mdadm-inc@.service
> @@ -0,0 +1,10 @@
> +[Unit]
> +Description=MD incremental assemblation on %I
> +DefaultDependencies=no
> +Before=initrd-switch-root.target
> +
> +[Service]
> +Type=forking
> +GuessMainPID=false
> +ExecStart=/sbin/mdadm -I %I
> +SuccessExitStatus=0 4
> diff --git a/udev-md-raid-assembly.rules b/udev-md-raid-assembly.rules
> index 824e7a9..e295875 100644
> --- a/udev-md-raid-assembly.rules
> +++ b/udev-md-raid-assembly.rules
> @@ -27,7 +27,9 @@ LABEL="md_inc"
> # remember you can limit what gets auto/incrementally assembled by
> # mdadm.conf(5)'s 'AUTO' and selectively whitelist using 'ARRAY'
> -ACTION=="add|change", IMPORT{program}="/sbin/mdadm --incremental --export $devnode --offroot ${DEVLINKS}"
> +ACTION=="add|change", PROGRAM="/bin/readlink /sbin/init", RESULT=="*systemd", TAG+="systemd", ENV{SYSTEMD_WANTS}="mdadm-inc@$devnode.service"
> +ACTION=="add|change", ENV{SYSTEMD_WANTS}!="?*", IMPORT{program}="/sbin/mdadm --incremental --export $devnode --offroot ${DEVLINKS}"
> +
> ACTION=="add|change", ENV{MD_STARTED}=="*unsafe*", ENV{MD_FOREIGN}=="no", ENV{SYSTEMD_WANTS}+="mdadm-last-resort@$env{MD_DEVICE}.timer"
> ACTION=="remove", ENV{ID_PATH}=="?*", RUN+="/sbin/mdadm -If $name --path $env{ID_PATH}"
> ACTION=="remove", ENV{ID_PATH}!="?*", RUN+="/sbin/mdadm -If $name"
> --
> 1.8.4.5
>