From: Hannes Reinecke <hare@suse.de>
To: dm-devel@redhat.com
Subject: Re: [PATCH 09/13] multipathd: Implement systemd watchdog integration
Date: Mon, 25 Nov 2013 17:21:15 +0100
Message-ID: <529378FB.4040803@suse.de>
In-Reply-To: <5293013B.4010008@suse.de>

On 11/25/2013 08:50 AM, Hannes Reinecke wrote:
> On 11/22/2013 11:17 PM, Benjamin Marzinski wrote:
[ .. ]
>> I'm not asking for systemd to actually shut down multipathd.  In a
>> production setup, killing multipathd because it had a temporary stall
>> seems like bad default behavior.  I haven't looked at the systemd
>> watchdog code to know if this is possible, but ideally, multipathd would
>> be able to just start sending watchdog notifications again, and be able
>> to continue on with just a message in the logs recording the timeout.
>>
> Not stopping. Restarting.
> The whole point of the watchdog code is to take some action if the
> watchdog messages fail.
> We should aim for
> a) make the watchdog interval the longest interval we're prepared to
>     wait for checkerloop to complete (hence the patch to measure the
>     elapsed time per loop iteration)
> b) have systemd restart multipathd whenever the watchdog triggers,
>     as then we're sure we can't recover from this.
>
> That should cover your sentiment, right?
>
>> I realize that there is a benefit to letting people know that there was
>> a problem, but the way it's appearing now, it will be pretty confusing to
>> the sysadmin who sees that, and filling up the logs with notification
>> rejections is pretty annoying.
>>
> Yeah, correct. We should be using the 'restart' flag in the service
> file. I did not do this as the patch went into systemd only
> recently, and one would need to figure out how to treat
> installations where an older systemd version is running.
>
It also looks as if we'd be tripping over RH bug#982379, where
the watchdog fails to shut down a process properly.
That is apparently fixed in 206, so we'd need a recent systemd
for this to work properly.

I'm _quite_ sure there are errors in earlier versions, where the
watchdog feature just causes a new process to be started, without
terminating the old one. _Very_ annoying.

I'll retest with the latest systemd and make the watchdog feature
conditional on the systemd version.
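
A rough sketch of what such a gate could look like, not the actual
patch. The helper names and the MIN_WATCHDOG_SYSTEMD_VERSION constant
are made up for illustration, and I'm assuming the version string is
obtained elsewhere (e.g. at build time). WATCHDOG_USEC is the
environment variable systemd sets for a service with WatchdogSec=
configured; pinging at half that interval follows the usual
recommendation.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Assumption: 206 is the first release carrying the watchdog fix. */
#define MIN_WATCHDOG_SYSTEMD_VERSION 206

/*
 * Hypothetical helper: decide from a systemd version string such as
 * "206" or "208-11.el7" whether the watchdog should be enabled.
 * Only the leading number is evaluated.
 * Returns 1 if the version is recent enough, 0 otherwise.
 */
static int watchdog_supported(const char *version)
{
	char *end;
	long v;

	if (!version)
		return 0;
	errno = 0;
	v = strtol(version, &end, 10);
	if (errno || end == version)
		return 0;
	return v >= MIN_WATCHDOG_SYSTEMD_VERSION;
}

/*
 * Hypothetical helper: read the watchdog timeout from WATCHDOG_USEC
 * (microseconds) and return half of it as the heartbeat interval.
 * Returns 0 if no watchdog is configured or the value is malformed.
 */
static unsigned long long watchdog_heartbeat_usec(void)
{
	const char *env = getenv("WATCHDOG_USEC");
	char *end;
	unsigned long long usec;

	/* Reject unset, empty, signed or otherwise non-numeric values. */
	if (!env || env[0] < '0' || env[0] > '9')
		return 0;
	errno = 0;
	usec = strtoull(env, &end, 10);
	if (errno || *end != '\0')
		return 0;
	return usec / 2;
}
```

The daemon would then only send watchdog notifications if both
watchdog_supported() returns 1 and watchdog_heartbeat_usec() is
non-zero, and otherwise run as before.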

Cheers,

Hannes
