* [PATCH 0/3] multipath-tools: fixes for systemd watchdog
@ 2024-11-14 14:42 Martin Wilck
2024-11-14 14:42 ` [PATCH 1/3] libmultipath: don't print error message if WATCHDOG_USEC is 0 Martin Wilck
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: Martin Wilck @ 2024-11-14 14:42 UTC (permalink / raw)
To: Christophe Varoqui, Benjamin Marzinski; +Cc: Martin Wilck, dm-devel
A set of small fixes to make multipathd better adhere to the conventions
for systemd's WatchdogSec= setting.
Martin Wilck (3):
libmultipath: don't print error message if WATCHDOG_USEC is 0
libmultipath: honor WATCHDOG_PID setting
libmultipath: cut watchdog interval in half
libmultipath/config.c | 21 ++++++++++++++++++---
libmultipath/config.h | 2 ++
libmultipath/libmultipath.version | 1 +
3 files changed, 21 insertions(+), 3 deletions(-)
--
2.47.0
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 1/3] libmultipath: don't print error message if WATCHDOG_USEC is 0
2024-11-14 14:42 [PATCH 0/3] multipath-tools: fixes for systemd watchdog Martin Wilck
@ 2024-11-14 14:42 ` Martin Wilck
2024-11-14 14:42 ` [PATCH 2/3] libmultipath: honor WATCHDOG_PID setting Martin Wilck
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Martin Wilck @ 2024-11-14 14:42 UTC (permalink / raw)
To: Christophe Varoqui, Benjamin Marzinski; +Cc: Martin Wilck, dm-devel
WATCHDOG_USEC may be set to 0, which means that the watchdog
is disabled in systemd.
Signed-off-by: Martin Wilck <mwilck@suse.com>
---
libmultipath/config.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/libmultipath/config.c b/libmultipath/config.c
index 0e3a5cc..226ddec 100644
--- a/libmultipath/config.c
+++ b/libmultipath/config.c
@@ -865,6 +865,9 @@ static void set_max_checkint_from_watchdog(struct config *conf)
unsigned long checkint;
if (envp && sscanf(envp, "%lu", &checkint) == 1) {
+ if (checkint == 0)
+ /* watchdog disabled */
+ return;
/* Value is in microseconds */
checkint /= 1000000;
if (checkint < 1 || checkint > UINT_MAX) {
--
2.47.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH 2/3] libmultipath: honor WATCHDOG_PID setting
2024-11-14 14:42 [PATCH 0/3] multipath-tools: fixes for systemd watchdog Martin Wilck
2024-11-14 14:42 ` [PATCH 1/3] libmultipath: don't print error message if WATCHDOG_USEC is 0 Martin Wilck
@ 2024-11-14 14:42 ` Martin Wilck
2024-11-14 14:42 ` [PATCH 3/3] libmultipath: cut watchdog interval in half Martin Wilck
2024-11-16 0:05 ` [PATCH 0/3] multipath-tools: fixes for systemd watchdog Benjamin Marzinski
3 siblings, 0 replies; 5+ messages in thread
From: Martin Wilck @ 2024-11-14 14:42 UTC (permalink / raw)
To: Christophe Varoqui, Benjamin Marzinski; +Cc: Martin Wilck, dm-devel
WATCHDOG_USEC should only be evaluated if WATCHDOG_PID is either
unset (systemd <= 208) or set to the main daemon's pid [1].
Passing the daemon's PID to set_max_checkint_from_watchdog() requires
a mechanism similar to what we've been using for get_multipath_config().
[1] https://www.freedesktop.org/software/systemd/man/latest/sd_watchdog_enabled.html
Signed-off-by: Martin Wilck <mwilck@suse.com>
---
libmultipath/config.c | 11 ++++++++++-
libmultipath/config.h | 2 ++
libmultipath/libmultipath.version | 1 +
3 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/libmultipath/config.c b/libmultipath/config.c
index 226ddec..bd199fa 100644
--- a/libmultipath/config.c
+++ b/libmultipath/config.c
@@ -858,12 +858,21 @@ process_config_dir(struct config *conf, char *dir)
pthread_cleanup_pop(1);
}
+__attribute__((weak)) pid_t daemon_pid = -1;
+
#ifdef USE_SYSTEMD
static void set_max_checkint_from_watchdog(struct config *conf)
{
- char *envp = getenv("WATCHDOG_USEC");
+ const char *envp;
unsigned long checkint;
+ long pid;
+ envp = getenv("WATCHDOG_PID");
+ /* See sd_watchdog_enabled(3) */
+ if (envp && sscanf(envp, "%lu", &pid) == 1 && pid != daemon_pid)
+ return;
+
+ envp = getenv("WATCHDOG_USEC");
if (envp && sscanf(envp, "%lu", &checkint) == 1) {
if (checkint == 0)
/* watchdog disabled */
diff --git a/libmultipath/config.h b/libmultipath/config.h
index 94cdf25..d12f63e 100644
--- a/libmultipath/config.h
+++ b/libmultipath/config.h
@@ -315,4 +315,6 @@ int parse_uid_attrs(char *uid_attrs, struct config *conf);
const char *get_uid_attribute_by_attrs(const struct config *conf,
const char *path_dev);
+/* Weak dummy function, meant to be overridden by multipathd */
+extern pid_t daemon_pid;
#endif
diff --git a/libmultipath/libmultipath.version b/libmultipath/libmultipath.version
index 6bdf694..a898f7a 100644
--- a/libmultipath/libmultipath.version
+++ b/libmultipath/libmultipath.version
@@ -70,6 +70,7 @@ global:
cleanup_multipath_and_paths;
coalesce_paths;
count_active_paths;
+ daemon_pid;
delete_all_foreign;
delete_foreign;
dm_cancel_deferred_remove;
--
2.47.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH 3/3] libmultipath: cut watchdog interval in half
2024-11-14 14:42 [PATCH 0/3] multipath-tools: fixes for systemd watchdog Martin Wilck
2024-11-14 14:42 ` [PATCH 1/3] libmultipath: don't print error message if WATCHDOG_USEC is 0 Martin Wilck
2024-11-14 14:42 ` [PATCH 2/3] libmultipath: honor WATCHDOG_PID setting Martin Wilck
@ 2024-11-14 14:42 ` Martin Wilck
2024-11-16 0:05 ` [PATCH 0/3] multipath-tools: fixes for systemd watchdog Benjamin Marzinski
3 siblings, 0 replies; 5+ messages in thread
From: Martin Wilck @ 2024-11-14 14:42 UTC (permalink / raw)
To: Christophe Varoqui, Benjamin Marzinski; +Cc: Martin Wilck, dm-devel
According to [1], a daemon should trigger the watchdog every half of the time
that is communicated via WATCHDOG_USEC.
[1] https://www.freedesktop.org/software/systemd/man/latest/sd_watchdog_enabled.html
Signed-off-by: Martin Wilck <mwilck@suse.com>
---
libmultipath/config.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/libmultipath/config.c b/libmultipath/config.c
index bd199fa..4d457b8 100644
--- a/libmultipath/config.c
+++ b/libmultipath/config.c
@@ -877,8 +877,11 @@ static void set_max_checkint_from_watchdog(struct config *conf)
if (checkint == 0)
/* watchdog disabled */
return;
- /* Value is in microseconds */
- checkint /= 1000000;
+ /*
+ * Value is in microseconds, and we should trigger the watchdog
+ * twice per interval.
+ */
+ checkint /= 2000000;
if (checkint < 1 || checkint > UINT_MAX) {
condlog(1, "invalid value for WatchdogSec: \"%s\"", envp);
return;
--
2.47.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH 0/3] multipath-tools: fixes for systemd watchdog
2024-11-14 14:42 [PATCH 0/3] multipath-tools: fixes for systemd watchdog Martin Wilck
` (2 preceding siblings ...)
2024-11-14 14:42 ` [PATCH 3/3] libmultipath: cut watchdog interval in half Martin Wilck
@ 2024-11-16 0:05 ` Benjamin Marzinski
3 siblings, 0 replies; 5+ messages in thread
From: Benjamin Marzinski @ 2024-11-16 0:05 UTC (permalink / raw)
To: Martin Wilck; +Cc: Christophe Varoqui, Martin Wilck, dm-devel
On Thu, Nov 14, 2024 at 03:42:21PM +0100, Martin Wilck wrote:
> A set of small fixes to make multipathd better adhere to the conventions
I'm a little confused by this. Your patches make sense, but why do we
bother limiting max_checkint to the Watchdog timer? We send a notify
every loop in checkerloop() regardless. Changing max_checkint won't make
us any less likely to hang for too long if we're using a synchronous
chacker, and I can't figure out why the Watchdog time should have
anything to do with how frequently we check working paths.
Your code changes looks fine, so assuming there's a use for this code:
Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
But possibly we should be removing set_max_checkint_from_watchdog(), or
replacing max_checkint with something like checker_timeout, but only for
synchronous checkers. Or possibly I'm just missing something here.
-Ben
> for systemd's WatchdogSec= setting.
>
> Martin Wilck (3):
> libmultipath: don't print error message if WATCHDOG_USEC is 0
> libmultipath: honor WATCHDOG_PID setting
> libmultipath: cut watchdog interval in half
>
> libmultipath/config.c | 21 ++++++++++++++++++---
> libmultipath/config.h | 2 ++
> libmultipath/libmultipath.version | 1 +
> 3 files changed, 21 insertions(+), 3 deletions(-)
>
> --
> 2.47.0
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2024-11-16 0:05 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-14 14:42 [PATCH 0/3] multipath-tools: fixes for systemd watchdog Martin Wilck
2024-11-14 14:42 ` [PATCH 1/3] libmultipath: don't print error message if WATCHDOG_USEC is 0 Martin Wilck
2024-11-14 14:42 ` [PATCH 2/3] libmultipath: honor WATCHDOG_PID setting Martin Wilck
2024-11-14 14:42 ` [PATCH 3/3] libmultipath: cut watchdog interval in half Martin Wilck
2024-11-16 0:05 ` [PATCH 0/3] multipath-tools: fixes for systemd watchdog Benjamin Marzinski
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.