* [PATCH 0/6 v2] Assorted patches relating to mdmon
@ 2023-03-13 3:42 NeilBrown
2023-03-13 3:42 ` [PATCH 1/6] Use existence of /etc/initrd-release to detect initrd NeilBrown
` (7 more replies)
0 siblings, 8 replies; 12+ messages in thread
From: NeilBrown @ 2023-03-13 3:42 UTC (permalink / raw)
To: Jes Sorensen; +Cc: linux-raid, Martin Wilck, Mariusz Tkaczyk, Paul Menzel
This series is a minor revision of the previous series with the same
name.
It includes the important bugfix identified by Mariusz and a few minor
improvements suggested by Paul.
Original comment:
mdmon is a root-storage daemon is the sense defined by systemd
documentation, but it does not follow the practice that systemd
recommends. Specifically it is run from the root filesystem when
possible. The instance started in the initrd hands-over to a root-fs
based instance, which then hands-over to an initrd-based instance
started by dracut at shutdown.
Part of the reason that we ignore systemd advise is that mdmon needs
access to the filesystem - specifically /dev and /sys - which is not
available in the initrd context after switchroot. We could possibly
change mdmon to work in the systemd-preferred way by splitting mdmon
into two processes instead of just having 2 threads. The "monitor"
process could running entirely in the initrd context, the "manager"
process could safely run in the root-fs context, passing newly opened
file descriptors to the monitor over a unix-domain socket.
But we aren't there yet and may never be.
For now, mdmon doesn't work correctly. There is no mechanism to ensure
a new instance starts after switchroot. Until recently the initrd
instance of the systemd mdmon unit would be stopped at switchroot time
because udev would temporarily forget about md devices. This would
allow the "udevadm trigger" process to start a new instance. udev was
recently fixed:
Commit: 7ec624147a41 ("udevadm: cleanup-db: don't delete information for kept db entries")
so now the attempt to start mdmon via "udevadm trigger" does nothing as
mdmon already has an active unit.
The net result is that mdmon continues running in the initrd mount
namespace and so cannot access new devices. Adding a device to a root
md array that depends on mdmon will no longer work.
We want the initrd instance of mdmon to continue running until the
root-fs based instance starts, and that really requires we have two
different systemd units. This series achieves this in the final patch by
using a different instance name inside or initrd and outside.
"initrd-mdfoo" and "mdfoo".
Other patches in the series are mostly clean-ups and minor improvements
in related code.
NeilBrown
---
NeilBrown (6):
Use existence of /etc/initrd-release to detect initrd.
Improvements for IMSM_NO_PLATFORM testing.
mdmon: don't test both 'all' and 'container_name'.
mdmon: change systemd unit file to use --foreground
mdmon: Remove need for KillMode=none
mdmon: Improve switchroot interactions.
Grow.c | 4 ++--
mdadm.h | 4 +++-
mdmon.c | 22 +++++++++++++-------
super-intel.c | 43 ++++++++++++++++++++++++++++++++++++---
systemd/mdmon@.service | 15 +++++++-------
udev-md-raid-arrays.rules | 3 ++-
util.c | 17 +++++-----------
7 files changed, 75 insertions(+), 33 deletions(-)
--
Signature
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/6] Use existence of /etc/initrd-release to detect initrd.
2023-03-13 3:42 [PATCH 0/6 v2] Assorted patches relating to mdmon NeilBrown
@ 2023-03-13 3:42 ` NeilBrown
2023-03-13 3:42 ` [PATCH 6/6] mdmon: Improve switchroot interactions NeilBrown
` (6 subsequent siblings)
7 siblings, 0 replies; 12+ messages in thread
From: NeilBrown @ 2023-03-13 3:42 UTC (permalink / raw)
To: Jes Sorensen; +Cc: linux-raid, Martin Wilck, Mariusz Tkaczyk, Paul Menzel
Since v183, systemd has used the existence of /etc/initrd-release to
detect if it is running in an initrd, rather than looking at the magic
number of the root filesystem's device. It is time for mdadm to do the
same.
Signed-off-by: NeilBrown <neilb@suse.de>
---
util.c | 10 +---------
1 file changed, 1 insertion(+), 9 deletions(-)
diff --git a/util.c b/util.c
index 9f1e1f7c1279..509fb43ea906 100644
--- a/util.c
+++ b/util.c
@@ -2227,15 +2227,7 @@ int continue_via_systemd(char *devnm, char *service_name)
int in_initrd(void)
{
- /* This is based on similar function in systemd. */
- struct statfs s;
- /* statfs.f_type is signed long on s390x and MIPS, causing all
- sorts of sign extension problems with RAMFS_MAGIC being
- defined as 0x858458f6 */
- return statfs("/", &s) >= 0 &&
- ((unsigned long)s.f_type == TMPFS_MAGIC ||
- ((unsigned long)s.f_type & 0xFFFFFFFFUL) ==
- ((unsigned long)RAMFS_MAGIC & 0xFFFFFFFFUL));
+ return access("/etc/initrd-release", F_OK) >= 0;
}
void reopen_mddev(int mdfd)
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 2/6] Improvements for IMSM_NO_PLATFORM testing.
2023-03-13 3:42 [PATCH 0/6 v2] Assorted patches relating to mdmon NeilBrown
` (3 preceding siblings ...)
2023-03-13 3:42 ` [PATCH 4/6] mdmon: change systemd unit file to use --foreground NeilBrown
@ 2023-03-13 3:42 ` NeilBrown
2023-03-19 16:30 ` Jes Sorensen
2023-03-13 3:42 ` [PATCH 5/6] mdmon: Remove need for KillMode=none NeilBrown
` (2 subsequent siblings)
7 siblings, 1 reply; 12+ messages in thread
From: NeilBrown @ 2023-03-13 3:42 UTC (permalink / raw)
To: Jes Sorensen; +Cc: linux-raid, Martin Wilck, Mariusz Tkaczyk, Paul Menzel
Factor out IMSM_NO_PLATFORM testing into a single function that caches
the result.
Allow mdmon to explicitly set the result to "1" so that we don't need
the ENV var in the unit file
Check if the kernel command line contains "mdadm.imsm.test=1" and in
that case assert NO_PLATFORM. This simplifies testing in a virtual
machine.
Signed-off-by: NeilBrown <neilb@suse.de>
---
mdadm.h | 2 ++
mdmon.c | 6 ++++++
super-intel.c | 43 ++++++++++++++++++++++++++++++++++++++++---
systemd/mdmon@.service | 3 ---
4 files changed, 48 insertions(+), 6 deletions(-)
diff --git a/mdadm.h b/mdadm.h
index 1674ce1307b2..16acb2dd1ce4 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -1258,6 +1258,8 @@ extern struct superswitch super0, super1;
extern struct superswitch super_imsm, super_ddf;
extern struct superswitch mbr, gpt;
+void imsm_set_no_platform(int v);
+
struct metadata_update {
int len;
char *buf;
diff --git a/mdmon.c b/mdmon.c
index 60ba318253b9..f557e12c6533 100644
--- a/mdmon.c
+++ b/mdmon.c
@@ -318,6 +318,12 @@ int main(int argc, char *argv[])
{NULL, 0, NULL, 0}
};
+ /*
+ * mdmon should never complain due to lack of a platform,
+ * that is mdadm's job if at all.
+ */
+ imsm_set_no_platform(1);
+
while ((opt = getopt_long(argc, argv, "thaF", options, NULL)) != -1) {
switch (opt) {
case 'a':
diff --git a/super-intel.c b/super-intel.c
index e155a8ae99cb..a514dea6f95c 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -629,6 +629,43 @@ static const char *_sys_dev_type[] = {
[SYS_DEV_VMD] = "VMD"
};
+static int no_platform = -1;
+
+static int check_no_platform(void)
+{
+ static const char search[] = "mdadm.imsm.test=1";
+ int fd;
+ char buf[1024];
+ int n = 0;
+
+ if (no_platform >= 0)
+ return no_platform;
+
+ if (check_env("IMSM_NO_PLATFORM")) {
+ no_platform = 1;
+ return 1;
+ }
+ fd = open("/proc/cmdline", O_RDONLY);
+ if (fd >= 0) {
+ n = read(fd, buf, sizeof(buf)-1);
+ close(fd);
+ }
+ if (n >= (int)sizeof(search)) {
+ buf[n] = 0;
+ if (strstr(buf, search) != NULL) {
+ no_platform = 1;
+ return 1;
+ }
+ }
+ no_platform = 0;
+ return 0;
+}
+
+void imsm_set_no_platform(int v)
+{
+ no_platform = v;
+}
+
const char *get_sys_dev_type(enum sys_dev_type type)
{
if (type >= SYS_DEV_MAX)
@@ -2699,7 +2736,7 @@ static int detail_platform_imsm(int verbose, int enumerate_only, char *controlle
int result=1;
if (enumerate_only) {
- if (check_env("IMSM_NO_PLATFORM"))
+ if (check_no_platform())
return 0;
list = find_intel_devices();
if (!list)
@@ -4722,7 +4759,7 @@ static int find_intel_hba_capability(int fd, struct intel_super *super, char *de
devname);
return 1;
}
- if (!is_fd_valid(fd) || check_env("IMSM_NO_PLATFORM")) {
+ if (!is_fd_valid(fd) || check_no_platform()) {
super->orom = NULL;
super->hba = NULL;
return 0;
@@ -10697,7 +10734,7 @@ static int imsm_get_allowed_degradation(int level, int raid_disks,
******************************************************************************/
int validate_container_imsm(struct mdinfo *info)
{
- if (check_env("IMSM_NO_PLATFORM"))
+ if (check_no_platform())
return 0;
struct sys_dev *idev;
diff --git a/systemd/mdmon@.service b/systemd/mdmon@.service
index cb6482d9ff29..e7ee17a8bf91 100644
--- a/systemd/mdmon@.service
+++ b/systemd/mdmon@.service
@@ -12,9 +12,6 @@ Before=initrd-switch-root.target
Documentation=man:mdmon(8)
[Service]
-# mdmon should never complain due to lack of a platform,
-# that is mdadm's job if at all.
-Environment=IMSM_NO_PLATFORM=1
# The mdmon starting in the initramfs (with dracut at least)
# cannot see sysfs after root is mounted, so we will have to
# 'takeover'. As the '--offroot --takeover' don't hurt when
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 3/6] mdmon: don't test both 'all' and 'container_name'.
2023-03-13 3:42 [PATCH 0/6 v2] Assorted patches relating to mdmon NeilBrown
2023-03-13 3:42 ` [PATCH 1/6] Use existence of /etc/initrd-release to detect initrd NeilBrown
2023-03-13 3:42 ` [PATCH 6/6] mdmon: Improve switchroot interactions NeilBrown
@ 2023-03-13 3:42 ` NeilBrown
2023-03-13 3:42 ` [PATCH 4/6] mdmon: change systemd unit file to use --foreground NeilBrown
` (4 subsequent siblings)
7 siblings, 0 replies; 12+ messages in thread
From: NeilBrown @ 2023-03-13 3:42 UTC (permalink / raw)
To: Jes Sorensen; +Cc: linux-raid, Martin Wilck, Mariusz Tkaczyk, Paul Menzel
If 'all' is not set, then container_name must be NULL, as nothing else
can set it. So simplify the test to ignore container_name.
This makes the purpose of the code more obvious.
Signed-off-by: NeilBrown <neilb@suse.de>
---
mdmon.c | 11 ++++-------
1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/mdmon.c b/mdmon.c
index f557e12c6533..6d37b17c3f53 100644
--- a/mdmon.c
+++ b/mdmon.c
@@ -358,7 +358,6 @@ int main(int argc, char *argv[])
}
}
-
if (in_initrd()) {
/*
* set first char of argv[0] to @. This is used by
@@ -368,12 +367,10 @@ int main(int argc, char *argv[])
argv[0][0] = '@';
}
- if (all == 0 && container_name == NULL) {
- if (argv[optind]) {
- container_name = get_md_name(argv[optind]);
- if (!container_name)
- return 1;
- }
+ if (!all && argv[optind]) {
+ container_name = get_md_name(argv[optind]);
+ if (!container_name)
+ return 1;
}
if (container_name == NULL || argc - optind > 1)
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 4/6] mdmon: change systemd unit file to use --foreground
2023-03-13 3:42 [PATCH 0/6 v2] Assorted patches relating to mdmon NeilBrown
` (2 preceding siblings ...)
2023-03-13 3:42 ` [PATCH 3/6] mdmon: don't test both 'all' and 'container_name' NeilBrown
@ 2023-03-13 3:42 ` NeilBrown
2023-03-13 3:42 ` [PATCH 2/6] Improvements for IMSM_NO_PLATFORM testing NeilBrown
` (3 subsequent siblings)
7 siblings, 0 replies; 12+ messages in thread
From: NeilBrown @ 2023-03-13 3:42 UTC (permalink / raw)
To: Jes Sorensen; +Cc: linux-raid, Martin Wilck, Mariusz Tkaczyk, Paul Menzel
There is no value in mdmon forking when it is running under systemd -
systemd can still track it anyway.
So add --foreground option, and remove "Type=forking".
Signed-off-by: NeilBrown <neilb@suse.de>
---
systemd/mdmon@.service | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/systemd/mdmon@.service b/systemd/mdmon@.service
index e7ee17a8bf91..3502fadc3fc4 100644
--- a/systemd/mdmon@.service
+++ b/systemd/mdmon@.service
@@ -17,8 +17,7 @@ Documentation=man:mdmon(8)
# 'takeover'. As the '--offroot --takeover' don't hurt when
# not necessary, are are useful with root-on-md in dracut,
# have them always present.
-ExecStart=BINDIR/mdmon --offroot --takeover %I
-Type=forking
+ExecStart=BINDIR/mdmon --foreground --offroot --takeover %I
# Don't set the PIDFile. It isn't necessary (systemd can work
# it out) and systemd will remove it when transitioning from
# initramfs to rootfs.
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 5/6] mdmon: Remove need for KillMode=none
2023-03-13 3:42 [PATCH 0/6 v2] Assorted patches relating to mdmon NeilBrown
` (4 preceding siblings ...)
2023-03-13 3:42 ` [PATCH 2/6] Improvements for IMSM_NO_PLATFORM testing NeilBrown
@ 2023-03-13 3:42 ` NeilBrown
2023-03-14 8:33 ` [PATCH 0/6 v2] Assorted patches relating to mdmon Mariusz Tkaczyk
2023-03-19 16:34 ` Jes Sorensen
7 siblings, 0 replies; 12+ messages in thread
From: NeilBrown @ 2023-03-13 3:42 UTC (permalink / raw)
To: Jes Sorensen; +Cc: linux-raid, Martin Wilck, Mariusz Tkaczyk, Paul Menzel
mdmon needs to keep running during the switchroot out of (at boot) and
then back into (at shutdown) the initrd. It runs until a new mdmon
takes over.
Killmode=none is used to achieve this, with the help of --offroot which
sets argv[0][0] to '@' which systemd understands.
This is needed because mdmon is currently run in system-mdmon.slice
which conflicts with shutdown.target so without Killmode=none mdmon
would get killed early in shutdown when system.mdmon.slice is removed.
As described in systemd.service(5), this conflict with shutdown can be
resolved by explicitly requesting system.slice, which is a natural
counterpart to DefaultDependencies=no.
So add that, and also add IgnoreOnIsolate=true to avoid another possible
source of an early death. With these we no longer need KillMode=none
which the systemd developers have marked as "deprecated".
Signed-off-by: NeilBrown <neilb@suse.de>
---
systemd/mdmon@.service | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/systemd/mdmon@.service b/systemd/mdmon@.service
index 3502fadc3fc4..79f80b5c372e 100644
--- a/systemd/mdmon@.service
+++ b/systemd/mdmon@.service
@@ -10,6 +10,9 @@ Description=MD Metadata Monitor on /dev/%I
DefaultDependencies=no
Before=initrd-switch-root.target
Documentation=man:mdmon(8)
+# Allow mdmon to keep running after switchroot, until a new
+# instance is started.
+IgnoreOnIsolate=true
[Service]
# The mdmon starting in the initramfs (with dracut at least)
@@ -22,4 +25,6 @@ ExecStart=BINDIR/mdmon --foreground --offroot --takeover %I
# it out) and systemd will remove it when transitioning from
# initramfs to rootfs.
#PIDFile=/run/mdadm/%I.pid
-KillMode=none
+# The default slice is system-mdmon.slice which Conflicts
+# with shutdown, causing mdmon to exit early. So use system.slice.
+Slice=system.slice
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 6/6] mdmon: Improve switchroot interactions.
2023-03-13 3:42 [PATCH 0/6 v2] Assorted patches relating to mdmon NeilBrown
2023-03-13 3:42 ` [PATCH 1/6] Use existence of /etc/initrd-release to detect initrd NeilBrown
@ 2023-03-13 3:42 ` NeilBrown
2023-03-13 3:42 ` [PATCH 3/6] mdmon: don't test both 'all' and 'container_name' NeilBrown
` (5 subsequent siblings)
7 siblings, 0 replies; 12+ messages in thread
From: NeilBrown @ 2023-03-13 3:42 UTC (permalink / raw)
To: Jes Sorensen; +Cc: linux-raid, Martin Wilck, Mariusz Tkaczyk, Paul Menzel
We need a new mdmon@mdfoo instance to run in the root filesystem after
switch root, as /sys and /dev are removed from the initrd.
systemd will not start a new unit with the same name running while the
old unit is still active, and we want the two mdmon processes to overlap
in time to avoid any risk of deadlock, which can happen when a write is
attempted with no mdmon running.
So we need a different unit name in the initrd than in the root. Apart
from the name, everything else should be the same.
This is easily achieved using a different instance name as the
mdmon@.service unit file already supports multiple instances (for
different arrays).
So start "mdmon@mdfoo.service" from root, but
"mdmon@initrd-mdfoo.service" from the initrd. udev can tell which
circumstance is the case by looking for /etc/initrd-release.
continue_from_systemd() is enhanced so that the "initrd-" prefix can be
requested.
Teach mdmon that a container name like "initrd/foo" should be treated
just like "foo". Note that systemd passes the instance name
"initrd-foo" as "initrd/foo".
We don't need a similar mechanism at shutdown because dracut runs
"mdmon --takeover --all" when appropriate.
Signed-off-by: NeilBrown <neilb@suse.de>
---
Grow.c | 4 ++--
mdadm.h | 2 +-
mdmon.c | 7 ++++++-
systemd/mdmon@.service | 2 +-
udev-md-raid-arrays.rules | 3 ++-
util.c | 7 ++++---
6 files changed, 16 insertions(+), 9 deletions(-)
diff --git a/Grow.c b/Grow.c
index bb5fe45c851c..06001f2d334a 100644
--- a/Grow.c
+++ b/Grow.c
@@ -3516,7 +3516,7 @@ started:
if (!forked)
if (continue_via_systemd(container ?: sra->sys_name,
- GROW_SERVICE)) {
+ GROW_SERVICE, NULL)) {
free(fdlist);
free(offsets);
sysfs_free(sra);
@@ -3714,7 +3714,7 @@ int reshape_container(char *container, char *devname,
ping_monitor(container);
if (!forked && !freeze_reshape)
- if (continue_via_systemd(container, GROW_SERVICE))
+ if (continue_via_systemd(container, GROW_SERVICE, NULL))
return 0;
switch (forked ? 0 : fork()) {
diff --git a/mdadm.h b/mdadm.h
index 16acb2dd1ce4..1af86dc79d0f 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -1605,7 +1605,7 @@ extern int same_dev(char *one, char *two);
extern int compare_paths (char* path1,char* path2);
extern void enable_fds(int devices);
extern void manage_fork_fds(int close_all);
-extern int continue_via_systemd(char *devnm, char *service_name);
+extern int continue_via_systemd(char *devnm, char *service_name, char *prefix);
extern void ident_init(struct mddev_ident *ident);
diff --git a/mdmon.c b/mdmon.c
index 6d37b17c3f53..cef5bbc8b0dd 100644
--- a/mdmon.c
+++ b/mdmon.c
@@ -368,7 +368,12 @@ int main(int argc, char *argv[])
}
if (!all && argv[optind]) {
- container_name = get_md_name(argv[optind]);
+ static const char prefix[] = "initrd/";
+ container_name = argv[optind];
+ if (strncmp(container_name, prefix,
+ sizeof(prefix) - 1) == 0)
+ container_name += sizeof(prefix)-1;
+ container_name = get_md_name(container_name);
if (!container_name)
return 1;
}
diff --git a/systemd/mdmon@.service b/systemd/mdmon@.service
index 79f80b5c372e..020cc7e15e1f 100644
--- a/systemd/mdmon@.service
+++ b/systemd/mdmon@.service
@@ -6,7 +6,7 @@
# (at your option) any later version.
[Unit]
-Description=MD Metadata Monitor on /dev/%I
+Description=MD Metadata Monitor on %I
DefaultDependencies=no
Before=initrd-switch-root.target
Documentation=man:mdmon(8)
diff --git a/udev-md-raid-arrays.rules b/udev-md-raid-arrays.rules
index 2967ace1f040..4e64b249b2db 100644
--- a/udev-md-raid-arrays.rules
+++ b/udev-md-raid-arrays.rules
@@ -38,7 +38,8 @@ ENV{MD_LEVEL}=="raid[1-9]*", ENV{SYSTEMD_WANTS}+="mdmonitor.service"
# Tell systemd to run mdmon for our container, if we need it.
ENV{MD_LEVEL}=="raid[1-9]*", ENV{MD_CONTAINER}=="?*", PROGRAM="/usr/bin/readlink $env{MD_CONTAINER}", ENV{MD_MON_THIS}="%c"
-ENV{MD_MON_THIS}=="?*", PROGRAM="/usr/bin/basename $env{MD_MON_THIS}", ENV{SYSTEMD_WANTS}+="mdmon@%c.service"
+ENV{MD_MON_THIS}=="?*", TEST=="/etc/initrd-release", PROGRAM="/usr/bin/basename $env{MD_MON_THIS}", ENV{SYSTEMD_WANTS}+="mdmon@initrd-%c.service"
+ENV{MD_MON_THIS}=="?*", TEST!="/etc/initrd-release", PROGRAM="/usr/bin/basename $env{MD_MON_THIS}", ENV{SYSTEMD_WANTS}+="mdmon@%c.service"
ENV{RESHAPE_ACTIVE}=="yes", PROGRAM="/usr/bin/basename $env{MD_MON_THIS}", ENV{SYSTEMD_WANTS}+="mdadm-grow-continue@%c.service"
LABEL="md_end"
diff --git a/util.c b/util.c
index 509fb43ea906..d70ca43bc31d 100644
--- a/util.c
+++ b/util.c
@@ -1916,6 +1916,7 @@ int start_mdmon(char *devnm)
int len;
pid_t pid;
int status;
+ char *prefix = in_initrd() ? "initrd-" : "";
char pathbuf[1024];
char *paths[4] = {
pathbuf,
@@ -1926,7 +1927,7 @@ int start_mdmon(char *devnm)
if (check_env("MDADM_NO_MDMON"))
return 0;
- if (continue_via_systemd(devnm, MDMON_SERVICE))
+ if (continue_via_systemd(devnm, MDMON_SERVICE, prefix))
return 0;
/* That failed, try running mdmon directly */
@@ -2197,7 +2198,7 @@ void manage_fork_fds(int close_all)
* 1- if systemd service has been started
* 0- otherwise
*/
-int continue_via_systemd(char *devnm, char *service_name)
+int continue_via_systemd(char *devnm, char *service_name, char *prefix)
{
int pid, status;
char pathbuf[1024];
@@ -2209,7 +2210,7 @@ int continue_via_systemd(char *devnm, char *service_name)
case 0:
manage_fork_fds(1);
snprintf(pathbuf, sizeof(pathbuf),
- "%s@%s.service", service_name, devnm);
+ "%s@%s%s.service", service_name, prefix ?: "", devnm);
status = execl("/usr/bin/systemctl", "systemctl", "restart",
pathbuf, NULL);
status = execl("/bin/systemctl", "systemctl", "restart",
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 0/6 v2] Assorted patches relating to mdmon
2023-03-13 3:42 [PATCH 0/6 v2] Assorted patches relating to mdmon NeilBrown
` (5 preceding siblings ...)
2023-03-13 3:42 ` [PATCH 5/6] mdmon: Remove need for KillMode=none NeilBrown
@ 2023-03-14 8:33 ` Mariusz Tkaczyk
2023-03-19 16:34 ` Jes Sorensen
7 siblings, 0 replies; 12+ messages in thread
From: Mariusz Tkaczyk @ 2023-03-14 8:33 UTC (permalink / raw)
To: Jes Sorensen
Cc: NeilBrown, linux-raid, Martin Wilck, Mariusz Tkaczyk, Paul Menzel
On Mon, 13 Mar 2023 14:42:58 +1100
NeilBrown <neilb@suse.de> wrote:
> This series is a minor revision of the previous series with the same
> name.
> It includes the important bugfix identified by Mariusz and a few minor
> improvements suggested by Paul.
>
> Original comment:
>
> mdmon is a root-storage daemon is the sense defined by systemd
> documentation, but it does not follow the practice that systemd
> recommends. Specifically it is run from the root filesystem when
> possible. The instance started in the initrd hands-over to a root-fs
> based instance, which then hands-over to an initrd-based instance
> started by dracut at shutdown.
>
> Part of the reason that we ignore systemd advise is that mdmon needs
> access to the filesystem - specifically /dev and /sys - which is not
> available in the initrd context after switchroot. We could possibly
> change mdmon to work in the systemd-preferred way by splitting mdmon
> into two processes instead of just having 2 threads. The "monitor"
> process could running entirely in the initrd context, the "manager"
> process could safely run in the root-fs context, passing newly opened
> file descriptors to the monitor over a unix-domain socket.
>
> But we aren't there yet and may never be.
>
> For now, mdmon doesn't work correctly. There is no mechanism to ensure
> a new instance starts after switchroot. Until recently the initrd
> instance of the systemd mdmon unit would be stopped at switchroot time
> because udev would temporarily forget about md devices. This would
> allow the "udevadm trigger" process to start a new instance. udev was
> recently fixed:
>
> Commit: 7ec624147a41 ("udevadm: cleanup-db: don't delete information for kept
> db entries")
>
> so now the attempt to start mdmon via "udevadm trigger" does nothing as
> mdmon already has an active unit.
>
> The net result is that mdmon continues running in the initrd mount
> namespace and so cannot access new devices. Adding a device to a root
> md array that depends on mdmon will no longer work.
>
> We want the initrd instance of mdmon to continue running until the
> root-fs based instance starts, and that really requires we have two
> different systemd units. This series achieves this in the final patch by
> using a different instance name inside or initrd and outside.
> "initrd-mdfoo" and "mdfoo".
>
> Other patches in the series are mostly clean-ups and minor improvements
> in related code.
>
> NeilBrown
>
>
>
Hi Jes,
Sorry for the delay. We found one grow related problem and needed to determine
if it is the patchset related. It seems not, so for all patches:
Reviewed-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Thanks to Neil we can revert "revert of Remove KillMode" patch (hopefully)
forever this time. Please apply those paches, then I will send a revert.
Thanks,
Mariusz
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/6] Improvements for IMSM_NO_PLATFORM testing.
2023-03-13 3:42 ` [PATCH 2/6] Improvements for IMSM_NO_PLATFORM testing NeilBrown
@ 2023-03-19 16:30 ` Jes Sorensen
2023-03-19 22:05 ` NeilBrown
0 siblings, 1 reply; 12+ messages in thread
From: Jes Sorensen @ 2023-03-19 16:30 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-raid, Martin Wilck, Mariusz Tkaczyk, Paul Menzel
On 3/12/23 23:42, NeilBrown wrote:
> Factor out IMSM_NO_PLATFORM testing into a single function that caches
> the result.
>
> Allow mdmon to explicitly set the result to "1" so that we don't need
> the ENV var in the unit file
>
> Check if the kernel command line contains "mdadm.imsm.test=1" and in
> that case assert NO_PLATFORM. This simplifies testing in a virtual
> machine.
>
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
> mdadm.h | 2 ++
> mdmon.c | 6 ++++++
> super-intel.c | 43 ++++++++++++++++++++++++++++++++++++++++---
> systemd/mdmon@.service | 3 ---
> 4 files changed, 48 insertions(+), 6 deletions(-)
Hi Neil
> diff --git a/super-intel.c b/super-intel.c
> index e155a8ae99cb..a514dea6f95c 100644
> --- a/super-intel.c
> +++ b/super-intel.c
> @@ -629,6 +629,43 @@ static const char *_sys_dev_type[] = {
> [SYS_DEV_VMD] = "VMD"
> };
>
> +static int no_platform = -1;
> +
> +static int check_no_platform(void)
> +{
> + static const char search[] = "mdadm.imsm.test=1";
> + int fd;
> + char buf[1024];
This isn't safe, /proc/cmdline can be longer than 1024 characters.
Cheers,
Jes
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 0/6 v2] Assorted patches relating to mdmon
2023-03-13 3:42 [PATCH 0/6 v2] Assorted patches relating to mdmon NeilBrown
` (6 preceding siblings ...)
2023-03-14 8:33 ` [PATCH 0/6 v2] Assorted patches relating to mdmon Mariusz Tkaczyk
@ 2023-03-19 16:34 ` Jes Sorensen
7 siblings, 0 replies; 12+ messages in thread
From: Jes Sorensen @ 2023-03-19 16:34 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-raid, Martin Wilck, Mariusz Tkaczyk, Paul Menzel
On 3/12/23 23:42, NeilBrown wrote:
> This series is a minor revision of the previous series with the same
> name.
> It includes the important bugfix identified by Mariusz and a few minor
> improvements suggested by Paul.
>
> Original comment:
>
> mdmon is a root-storage daemon is the sense defined by systemd
> documentation, but it does not follow the practice that systemd
> recommends. Specifically it is run from the root filesystem when
> possible. The instance started in the initrd hands-over to a root-fs
> based instance, which then hands-over to an initrd-based instance
> started by dracut at shutdown.
>
> Part of the reason that we ignore systemd advise is that mdmon needs
> access to the filesystem - specifically /dev and /sys - which is not
> available in the initrd context after switchroot. We could possibly
> change mdmon to work in the systemd-preferred way by splitting mdmon
> into two processes instead of just having 2 threads. The "monitor"
> process could running entirely in the initrd context, the "manager"
> process could safely run in the root-fs context, passing newly opened
> file descriptors to the monitor over a unix-domain socket.
>
> But we aren't there yet and may never be.
>
> For now, mdmon doesn't work correctly. There is no mechanism to ensure
> a new instance starts after switchroot. Until recently the initrd
> instance of the systemd mdmon unit would be stopped at switchroot time
> because udev would temporarily forget about md devices. This would
> allow the "udevadm trigger" process to start a new instance. udev was
> recently fixed:
>
> Commit: 7ec624147a41 ("udevadm: cleanup-db: don't delete information for kept db entries")
>
> so now the attempt to start mdmon via "udevadm trigger" does nothing as
> mdmon already has an active unit.
>
> The net result is that mdmon continues running in the initrd mount
> namespace and so cannot access new devices. Adding a device to a root
> md array that depends on mdmon will no longer work.
>
> We want the initrd instance of mdmon to continue running until the
> root-fs based instance starts, and that really requires we have two
> different systemd units. This series achieves this in the final patch by
> using a different instance name inside or initrd and outside.
> "initrd-mdfoo" and "mdfoo".
>
> Other patches in the series are mostly clean-ups and minor improvements
> in related code.
>
> NeilBrown
Applied, 1, 3-6. Skipping 2 per previous email.
Thanks,
Jes
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/6] Improvements for IMSM_NO_PLATFORM testing.
2023-03-19 16:30 ` Jes Sorensen
@ 2023-03-19 22:05 ` NeilBrown
2023-03-20 14:58 ` Jes Sorensen
0 siblings, 1 reply; 12+ messages in thread
From: NeilBrown @ 2023-03-19 22:05 UTC (permalink / raw)
To: Jes Sorensen; +Cc: linux-raid, Martin Wilck, Mariusz Tkaczyk, Paul Menzel
On Mon, 20 Mar 2023, Jes Sorensen wrote:
> On 3/12/23 23:42, NeilBrown wrote:
> > Factor out IMSM_NO_PLATFORM testing into a single function that caches
> > the result.
> >
> > Allow mdmon to explicitly set the result to "1" so that we don't need
> > the ENV var in the unit file
> >
> > Check if the kernel command line contains "mdadm.imsm.test=1" and in
> > that case assert NO_PLATFORM. This simplifies testing in a virtual
> > machine.
> >
> > Signed-off-by: NeilBrown <neilb@suse.de>
> > ---
> > mdadm.h | 2 ++
> > mdmon.c | 6 ++++++
> > super-intel.c | 43 ++++++++++++++++++++++++++++++++++++++++---
> > systemd/mdmon@.service | 3 ---
> > 4 files changed, 48 insertions(+), 6 deletions(-)
>
> Hi Neil
>
> > diff --git a/super-intel.c b/super-intel.c
> > index e155a8ae99cb..a514dea6f95c 100644
> > --- a/super-intel.c
> > +++ b/super-intel.c
> > @@ -629,6 +629,43 @@ static const char *_sys_dev_type[] = {
> > [SYS_DEV_VMD] = "VMD"
> > };
> >
> > +static int no_platform = -1;
> > +
> > +static int check_no_platform(void)
> > +{
> > + static const char search[] = "mdadm.imsm.test=1";
> > + int fd;
> > + char buf[1024];
>
> This isn't safe, /proc/cmdline can be longer than 1024 characters.
Why not safe? I agree that /proc/cmdline can be longer than 1024 but the
only only considers the first 1023 at most, and does so safely.
What would you prefer? Should I use conf_line() to read the line and
split it into words for us?
Thanks,
NeilBrown
>
> Cheers,
> Jes
>
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/6] Improvements for IMSM_NO_PLATFORM testing.
2023-03-19 22:05 ` NeilBrown
@ 2023-03-20 14:58 ` Jes Sorensen
0 siblings, 0 replies; 12+ messages in thread
From: Jes Sorensen @ 2023-03-20 14:58 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-raid, Martin Wilck, Mariusz Tkaczyk, Paul Menzel
On 3/19/23 18:05, NeilBrown wrote:
> On Mon, 20 Mar 2023, Jes Sorensen wrote:
>> On 3/12/23 23:42, NeilBrown wrote:
>>> Factor out IMSM_NO_PLATFORM testing into a single function that caches
>>> the result.
>>>
>>> Allow mdmon to explicitly set the result to "1" so that we don't need
>>> the ENV var in the unit file
>>>
>>> Check if the kernel command line contains "mdadm.imsm.test=1" and in
>>> that case assert NO_PLATFORM. This simplifies testing in a virtual
>>> machine.
>>>
>>> Signed-off-by: NeilBrown <neilb@suse.de>
>>> ---
>>> mdadm.h | 2 ++
>>> mdmon.c | 6 ++++++
>>> super-intel.c | 43 ++++++++++++++++++++++++++++++++++++++++---
>>> systemd/mdmon@.service | 3 ---
>>> 4 files changed, 48 insertions(+), 6 deletions(-)
>>
>> Hi Neil
>>
>>> diff --git a/super-intel.c b/super-intel.c
>>> index e155a8ae99cb..a514dea6f95c 100644
>>> --- a/super-intel.c
>>> +++ b/super-intel.c
>>> @@ -629,6 +629,43 @@ static const char *_sys_dev_type[] = {
>>> [SYS_DEV_VMD] = "VMD"
>>> };
>>>
>>> +static int no_platform = -1;
>>> +
>>> +static int check_no_platform(void)
>>> +{
>>> + static const char search[] = "mdadm.imsm.test=1";
>>> + int fd;
>>> + char buf[1024];
>>
>> This isn't safe, /proc/cmdline can be longer than 1024 characters.
>
> Why not safe? I agree that /proc/cmdline can be longer than 1024 but the
> only only considers the first 1023 at most, and does so safely.
>
> What would you prefer? Should I use conf_line() to read the line and
> split it into words for us?
Hi Neil,
Not safe was probably not the right word. I just saw your new patch, I
think that's much better. My concern is that it failing to parse in some
conditions is going to be a pain in the neck to debug.
Thanks,
Jes
>>
>>
>
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2023-03-20 15:04 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-03-13 3:42 [PATCH 0/6 v2] Assorted patches relating to mdmon NeilBrown
2023-03-13 3:42 ` [PATCH 1/6] Use existence of /etc/initrd-release to detect initrd NeilBrown
2023-03-13 3:42 ` [PATCH 6/6] mdmon: Improve switchroot interactions NeilBrown
2023-03-13 3:42 ` [PATCH 3/6] mdmon: don't test both 'all' and 'container_name' NeilBrown
2023-03-13 3:42 ` [PATCH 4/6] mdmon: change systemd unit file to use --foreground NeilBrown
2023-03-13 3:42 ` [PATCH 2/6] Improvements for IMSM_NO_PLATFORM testing NeilBrown
2023-03-19 16:30 ` Jes Sorensen
2023-03-19 22:05 ` NeilBrown
2023-03-20 14:58 ` Jes Sorensen
2023-03-13 3:42 ` [PATCH 5/6] mdmon: Remove need for KillMode=none NeilBrown
2023-03-14 8:33 ` [PATCH 0/6 v2] Assorted patches relating to mdmon Mariusz Tkaczyk
2023-03-19 16:34 ` Jes Sorensen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).