From: Steven Whitehouse <swhiteho@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] fencing conditions: what should trigger a fencing operation?
Date: Thu, 19 Nov 2009 16:15:58 +0000
Message-ID: <1258647358.6052.935.camel@localhost.localdomain>
In-Reply-To: <20091119170404.GA23287@redhat.com>

Hi,

On Thu, 2009-11-19 at 11:04 -0600, David Teigland wrote:
> On Thu, Nov 19, 2009 at 12:35:05PM +0100, Fabio M. Di Nitto wrote:
>
> > - what are the current fencing policies?
>
> node failure
>
I think what Fabio is asking is: what event is considered to be a node
failure? From your description it sounds as though it means a failure of
corosync communications. Are there other things which can feed into this,
though? For example, dlm seems to have some kind of timeout mechanism
which sends a message to userspace, and I wonder whether that
contributes to the decision too.

It certainly isn't desirable for all types of filesystem failure to
result in fencing & automatic recovery. I think we've got that wrong in
the past. I posted a patch a few days back to try to address some of
that. In the case where we find an invalid block in a journal during
recovery, we certainly don't want to try to recover the journal on
another node, nor even kill the recovering node, since that will only
result in another node trying to recover the same journal and hitting
the same error. Eventually it will bring down the whole cluster.

The aim of the patch was to return a suitable status indicating why
journal recovery failed so that it can then be handled appropriately.

Steve.
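
As an illustration only of the last two paragraphs, here is a minimal
sketch of a status-aware recovery policy. It is not the patch mentioned
above and not GFS2 code: every name in it (RecoveryStatus, Action,
action_for, recover_journal, nodes_lost) is hypothetical, and the
LOCAL_IO_ERROR case is an assumption added for contrast. It compares a
policy that fences on any recovery failure with one that looks at why
recovery failed.

```python
from enum import Enum, auto


class RecoveryStatus(Enum):
    """Hypothetical reasons a journal recovery attempt can end."""
    SUCCESS = auto()
    INVALID_JOURNAL_BLOCK = auto()  # the journal itself is bad
    LOCAL_IO_ERROR = auto()         # assumption: a fault local to one node


class Action(Enum):
    """Hypothetical responses to a recovery result."""
    NONE = auto()
    RETRY_ON_OTHER_NODE = auto()
    STOP_AND_REPORT = auto()


def action_for(status):
    """Map the reason for failure to a response."""
    if status is RecoveryStatus.SUCCESS:
        return Action.NONE
    if status is RecoveryStatus.INVALID_JOURNAL_BLOCK:
        # Every node would read the same bad block, so retrying
        # elsewhere (or fencing the recovering node) cannot help.
        return Action.STOP_AND_REPORT
    # Assumption: a node-local fault may not affect another node.
    return Action.RETRY_ON_OTHER_NODE


def recover_journal(journal_is_corrupt):
    """Stand-in for a recovery attempt; always fails on a corrupt journal."""
    if journal_is_corrupt:
        return RecoveryStatus.INVALID_JOURNAL_BLOCK
    return RecoveryStatus.SUCCESS


def nodes_lost(nodes, journal_is_corrupt, fence_on_any_failure):
    """Return the nodes fenced while the cluster tries to recover one journal."""
    lost = []
    for node in nodes:
        status = recover_journal(journal_is_corrupt)
        if fence_on_any_failure:
            if status is RecoveryStatus.SUCCESS:
                break
            lost.append(node)  # fence this node; the next one then tries
            continue
        if action_for(status) is not Action.RETRY_ON_OTHER_NODE:
            break              # recovered, or stop and report; nobody is fenced
    return lost
```

With a corrupt journal, nodes_lost(["a", "b", "c"], True, True) fences
each node in turn, while the status-aware variant (last argument False)
stops after the first attempt and fences none.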