From: Fabio M. Di Nitto <fdinitto@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] fencing conditions: what should trigger a fencing operation?
Date: Thu, 19 Nov 2009 12:35:05 +0100
Message-ID: <4B052D69.3010502@redhat.com>
Hi guys,
I have just hit what I think is a bug, and I think we need to review our
fencing policies.
This is what I saw:
- 6-node cluster (node1-3 x86, node4-6 x86_64)
- node1 and node4 run a simple loop on gfs2: mount -> wait -> umount ->
wait -> mount, and so on forever
- node2 and node5 perform read/write operations on the same gfs2
partition (nothing fancy really)
- node3 is in charge of creating and removing clustered lv volumes.
- node6 is in charge of constantly relocating rgmanager services.
The cluster is running qdisk too.
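For anyone wanting to reproduce the node1/node4 load, the mount cycle can be sketched roughly as below. This is only an illustration, not the script actually used in the test: the device path, mount point and wait time are hypothetical placeholders, and the `run` and `sleep` parameters exist only so the loop can be exercised without touching a real filesystem.

```python
import subprocess
import time
from typing import Callable, Optional, Sequence

# Hypothetical values: the original report does not name the device,
# the mount point, or how long each wait lasts.
DEVICE = "/dev/vg_cluster/lv_gfs2"
MOUNT_POINT = "/mnt/gfs2"


def run_command(argv: Sequence[str]) -> None:
    """Run one command; raise CalledProcessError if it exits non-zero."""
    subprocess.run(list(argv), check=True)


def mount_cycle(
    device: str = DEVICE,
    mount_point: str = MOUNT_POINT,
    wait_seconds: float = 5.0,
    iterations: Optional[int] = None,
    run: Callable[[Sequence[str]], None] = run_command,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Repeat mount -> wait -> umount -> wait.

    iterations=None loops forever, as in the test described above.
    Returns the number of completed cycles when iterations is finite.
    """
    done = 0
    while iterations is None or done < iterations:
        run(["mount", "-t", "gfs2", device, mount_point])
        sleep(wait_seconds)
        run(["umount", mount_point])
        sleep(wait_seconds)
        done += 1
    return done
```

Calling `mount_cycle()` with no arguments (as root, on a cluster node) loops until a command fails; passing `iterations=N` gives a bounded run.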
It is a known issue that node1 will crash at some point (kernel OOPS).
Here are the interesting bits:
node1 is hanging in mount/umount (expected)
node2, node4, node5 will continue to operate as normal.
node3 is now hanging creating a vg.
node6 is trying to stop a service on node1 (the service happened to be
located there at the time of the crash).
I was expecting that node1 would be fenced after the failure, but nothing
is happening automatically.
Manually fencing the node will recover all hanging operations.
From talking to Steven W., it appears that the way we define and detect a
failure should be improved.
My questions, simply driven by the fact that I am not a fence expert, are:
- what are the current fencing policies?
- what can we do to improve them?
- should we monitor for more failures than we do now?
Cheers
Fabio