From: Sunil Mushran <Sunil.Mushran@oracle.com>
To: ocfs2-devel@oss.oracle.com
Subject: [Ocfs2-devel] Please help me in getting OCFS2 design doc.
Date: Mon, 08 May 2006 10:50:10 -0700
Message-ID: <445F84D2.908@oracle.com>
In-Reply-To: <c04850370605080623r3867621bk8f036c856691e9e7@mail.gmail.com>

Sum Sha wrote:
> Thanks again for providing this information. I thought, we create
> volumes with ASM and then format those volumes with OCFS2 and then
> mount them. I probably missed it ;-)
>
> One more thing which I'd be interested to discuss is "heartbeat
> mechanism" and "self-fencing" behaviour of OCFS2.
>
> I have read that
> "An active node is deemed dead if it does not update its timestamp for
> O2CB_HEARTBEAT_THRESHOLD (default=7) loops"
>                                           and
> "A node self-fences if it fails to update its timestamp for
> ((O2CB_HEARTBEAT_THRESHOLD - 1) * 2) secs. The [o2hb-xx] kernel
> thread, after every timestamp write, sets a timer to panic the system
> after that duration. If the next timestamp is written within that
> duration, as it should, it first cancels that timer before setting up
> a new one"
>
> Here, the first case seems to be dependent on the second one, isn't
> it? If a node is not able to see other nodes' timestamp within
> (O2CB_HEARTBEAT_THRESHOLD * 2) time, then it assumes one of the
> following things:
>
> 1. The other node could not put timestamp within
> (O2CB_HEARTBEAT_THRESHOLD - 1) * 2 time and panicked itself.
>                                OR
> 2. The other node is actually dead and we give extra 2 seconds to
> detect that. Are we giving these extra 2 seconds to [o2hb-xx] kernel
> thread for detecting this scenario?
>
>   
I am not sure what the difference is between the two. The other nodes
don't care why a node failed to update its heartbeat. All they care
about is whether it was updated or not. Also, the extra 2 secs should
be viewed from the other side: the node panics itself 2 secs before the
other nodes will deem it dead and kick it out of the cluster.
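
To put numbers on the two timeouts, here is a minimal sketch of the
arithmetic. It assumes a 2 sec heartbeat interval, which is what the
"* 2" in the quoted formulas implies. The function and constant names
are made up for illustration and are not symbols from the OCFS2 code.

```python
# Illustrative arithmetic only; these names are hypothetical and do not
# come from the OCFS2 sources.
HEARTBEAT_INTERVAL_SECS = 2  # assumed from the "* 2" in the quoted formulas


def self_fence_timeout(threshold: int) -> int:
    """Secs after its last timestamp write at which a node panics itself."""
    return (threshold - 1) * HEARTBEAT_INTERVAL_SECS


def peer_dead_timeout(threshold: int) -> int:
    """Secs without a timestamp update after which peers deem the node dead."""
    return threshold * HEARTBEAT_INTERVAL_SECS


def fence_margin(threshold: int) -> int:
    """How many secs earlier the node fences itself than peers evict it."""
    return peer_dead_timeout(threshold) - self_fence_timeout(threshold)


if __name__ == "__main__":
    threshold = 7  # the O2CB_HEARTBEAT_THRESHOLD default quoted above
    print("self-fence after", self_fence_timeout(threshold), "secs")
    print("deemed dead after", peer_dead_timeout(threshold), "secs")
    print("margin", fence_margin(threshold), "secs")
```

With the default of 7 this gives 12 secs to self-fence and 14 secs to be
deemed dead. Under the assumed interval the margin is always 2 secs,
whatever the threshold is set to.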
