From: David Teigland <teigland@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] cluster/group/daemon cman.c cpg.c gd_internal. ...
Date: Tue, 20 Jun 2006 15:06:02 -0500
Message-ID: <20060620200602.GB12160@redhat.com>
In-Reply-To: <44984FF6.9010406@redhat.com>

On Tue, Jun 20, 2006 at 02:43:50PM -0500, Robert Peterson wrote:
> David Teigland wrote:
> >Might be a good idea, I don't really know.  I'm not even sure we'd need to
> >save much or any additional state that couldn't be pulled from the gfs/dlm
> >instances themselves.  It seems to me the challenge would be writing the
> >daemons so they could put all the pieces and interconnections back
> >together again.
> >
> >If this ends up being a big enough problem to get more attention, I think
> >the first practical improvement we could make is something like
> >blocking/clearing i/o from the residual fs's (like we do in withdraw) and
> >adding the ability to fully purge instances of gfs/dlm from the kernel
> >without rebooting the node.  Then the machines could all start from
> >scratch without rebooting or fencing.
> Here's another idea that came to me:
> 
> For critical cluster processes like cman and fenced, maybe we could use
> init's ability to restart processes, i.e. the "respawn" option in
> /etc/inittab.  Maybe we can use "respawn" or something similar to ensure
> that if a critical process like fenced dies, it gets restarted
> automatically and immediately.  Of course, that might cause problems for
> shutdown, etc., and it would probably make it harder to test certain
> things...

Assuming the daemon is managing something, then the failure amounts to a
full node failure and the node needs to be recovered by the other nodes.
Respawning the daemons in that case probably just gets in the way of the
other nodes doing recovery.  Respawning would make sense if the daemons
failed when they weren't managing anything, but that's pretty unlikely.

Dave
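
The rule described in the reply above can be put as a small decision
function. This is only an illustrative sketch, not code from the cluster
tree: the names DaemonState, managed_groups, Action and on_daemon_exit are
all hypothetical, and "managing something" is assumed to be reducible to a
simple count of groups the daemon was responsible for when it died.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    """What to do after a cluster daemon exits unexpectedly."""
    RESPAWN = "respawn"              # restart the daemon locally
    NODE_RECOVERY = "node_recovery"  # leave it to the other nodes to recover


@dataclass
class DaemonState:
    """Hypothetical snapshot of a daemon at the moment it failed."""
    name: str
    managed_groups: int  # assumption: number of groups it was managing


def on_daemon_exit(state: DaemonState) -> Action:
    """Pick a response to a daemon failure.

    If the daemon was managing anything, the failure is treated as a full
    node failure and a local respawn would only get in the way of recovery
    by the other nodes. A respawn is chosen only when nothing was managed.
    """
    if state.managed_groups > 0:
        return Action.NODE_RECOVERY
    return Action.RESPAWN
```

For example, on_daemon_exit(DaemonState("fenced", 2)) selects node
recovery, while on_daemon_exit(DaemonState("fenced", 0)) selects a respawn,
which is the case the reply calls pretty unlikely.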



Thread overview: 6+ messages
2006-06-20 18:09 [Cluster-devel] cluster/group/daemon cman.c cpg.c gd_internal. teigland
2006-06-20 18:56 ` Robert Peterson
2006-06-20 19:19   ` David Teigland
2006-06-20 19:43     ` Robert Peterson
2006-06-20 20:06       ` David Teigland [this message]
2006-06-20 20:13       ` Steven Dake
