From: "Nicolas de Pesloüan" <nicolas.2p.debian@gmail.com>
To: Patrick Schaaf <netdev@bof.de>
Cc: netdev@vger.kernel.org
Subject: Re: RFC: pid "ownership" of ip config information
Date: Sun, 23 Jan 2011 13:32:53 +0100
Message-ID: <4D3C1FF5.2010607@gmail.com>
In-Reply-To: <1295778271.5657.7.camel@lat1>

On 23/01/2011 11:24, Patrick Schaaf wrote:
> On Fri, 2011-01-21 at 11:17 +0100, Nicolas de Pesloüan wrote:
>> On 21/01/2011 10:28, Patrick Schaaf wrote:
>>> The alternative to such a feature, would be to have an additional
>>> monitoring process, which would watch the PID somehow, and need to
>>> be configured to know what to withdraw when it dies.
>
>> There exist some user-space clustering systems that should provide the same functionality. Did you
>> have a look at http://www.linux-ha.org/?
>
> Those would be the more complex instances of "an additional monitoring
> process", right?
>
> What happens when heartbeat is "kill -9"ed? Assume that I want to avoid
> STONITH-like approaches.
>
> My proposal could be _used_ by such complex clustering managers, too.
>
> Or did I overlook a kernel-based solution there to "withdraw IP config
> when processes die"?
>
> Can you provide a direct link on linux-ha?

Do you consider "withdraw IP config" the only feature that is needed when a process dies? Or shall
we instead design a more generic framework to run a command or make a system call when a process
dies? /sbin/init is probably already doing something similar. Arguably, even init may hang...
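
As a rough userland illustration of the generic "run a command when a process dies" idea (this is not the kernel mechanism under discussion, and the function name and arguments are hypothetical), a supervisor could look like the sketch below. It starts the service as a child, blocks until the child exits for any reason, and then runs a cleanup command:

```python
import subprocess


def run_with_cleanup(service_cmd, cleanup_cmd):
    """Run service_cmd as a child, wait for it to die, then run cleanup_cmd.

    Hypothetical helper for illustration only.
    Returns (service_returncode, cleanup_returncode).
    """
    service = subprocess.Popen(service_cmd)
    # Blocks until the child exits. On POSIX, a negative return code means
    # the child was terminated by that signal number (e.g. -9 for SIGKILL).
    service_rc = service.wait()
    # The child is gone, so withdraw whatever it "owned".
    cleanup_rc = subprocess.call(cleanup_cmd)
    return service_rc, cleanup_rc


# Example call (daemon path, address and device are placeholders):
# run_with_cleanup(["/path/to/daemon"],
#                  ["ip", "addr", "del", "192.0.2.10/24", "dev", "eth0"])
```

The sketch also shows the weakness raised in this thread: if the supervisor itself is hung or killed with "kill -9", the cleanup never runs.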

If your point is to provide a safety net for a very sick but not really dead node, then no userland
system would help. As such, I agree with you that an automatic withdrawal of IP config might help.
However, how would you protect against a simple never-ending loop in the process, or against a very
slow process due to high load on the node? You probably also need to guard against a process that
no longer reads its network receive queue.

This might end up as some sort of local heartbeat monitoring of userland processes, in the
kernel, and I'm not sure anyone would support this.
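
To make the heartbeat idea concrete, here is a minimal userland sketch (not a kernel implementation; the class and parameter names are hypothetical). The monitored process would have to call beat() from its real work loop, so a process stuck in an endless loop or starved by load stops beating and gets flagged even though its PID still exists:

```python
import time


class HeartbeatWatchdog:
    """Track the last heartbeat from a monitored process (hypothetical helper)."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout      # seconds of silence tolerated
        self.clock = clock          # injectable clock, handy for testing
        self.last_beat = clock()

    def beat(self):
        """Called by the monitored process each time it makes progress."""
        self.last_beat = self.clock()

    def expired(self):
        """True if no heartbeat arrived within the timeout."""
        return self.clock() - self.last_beat > self.timeout
```

A monitor would poll expired() periodically and withdraw the IP configuration once it returns True. Choosing the timeout is the hard part: too short and a merely loaded node is penalised, too long and a dead service keeps its address.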

And whatever you do locally on a node to ensure proper operation, you also need a way to check for
proper operation from outside the node. A STONITH system is always required, in order to kill a
totally mad node. Even the kernel may go mad.

Nicolas.