* [Ocfs2-devel] dlm_pick_recovery_master algorithm?
From: Daniel Phillips @ 2006-05-31 20:56 UTC
  To: ocfs2-devel

Hi,

I'm trying to understand the dlm_pick_recovery_master algorithm and I have
a few questions.  Which node masters the $RECOVERY resource?  Where is that
set?  What happens when that node dies?  Why can dlm_pick_recovery_master
get the EX on $RECOVERY and still not be the recovery master?

Regards,

Daniel


* [Ocfs2-devel] dlm_pick_recovery_master algorithm?
From: Kurt Hackel @ 2006-05-31 21:25 UTC
  To: ocfs2-devel

Hi Daniel,

> Which node masters the $RECOVERY resource?  

As with the mastery of any lock resource, any/all nodes can race simultaneously to try to master the $RECOVERY resource.  There are some small differences in the mastery process for recovery to ensure that deadlocks don't occur, and to detect and handle node death.

> Where is that set?

Almost all of this is done in fs/ocfs2/dlm/dlmmaster.c, and the eventual master is set in the same way as for all other lock resources: with the assert_master message.

> What happens when that node dies?

As soon as a node is seen as dead (via the heartbeat callback), cleanup occurs on all of the locks contained within lock resources that node mastered.  This includes the $RECOVERY lockres, though there is a special case in place to ensure that the $RECOVERY lockres is remastered at that point instead of being recovered.  Once it is remastered with the new cluster membership, it continues as normal.
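
The special case described above can be sketched as a small decision function.  This is only an illustration under stated assumptions: the enum, the function name on_node_death, and the string comparison are hypothetical and are not taken from fs/ocfs2/dlm.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical outcome for one lock resource when a node dies. */
enum lockres_action {
    LOCKRES_KEEP,      /* master is still alive, nothing to do */
    LOCKRES_RECOVER,   /* mastered by the dead node: normal recovery */
    LOCKRES_REMASTER   /* $RECOVERY itself: pick a new master instead */
};

/* Decide what to do with a lockres once dead_node is seen as dead.
 * Illustrative only; the real code works on lockres structures. */
static enum lockres_action on_node_death(const char *name, int master,
                                         int dead_node)
{
    assert(name != NULL);

    if (master != dead_node)
        return LOCKRES_KEEP;
    if (strcmp(name, "$RECOVERY") == 0)
        return LOCKRES_REMASTER;   /* the special case */
    return LOCKRES_RECOVER;
}
```

The point of the sketch is only the ordering of the checks: the $RECOVERY name is tested before falling through to ordinary recovery.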

> Why can dlm_pick_recovery_master
> get the EX on $RECOVERY and still not be the recovery master?

The EX lock on the $RECOVERY lockres is only used to protect the begin_reco message (the message which tells other nodes which node to recover and which will be the new master).  After that message is sent to all living nodes, the EX is dropped.  If a node has been waiting on the EX and does get it, it checks to see if the begin_reco has been sent while it was waiting.  If so, it backs off and lets the recovery master continue.
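
A minimal sketch of that back-off check, assuming a single shared state structure in place of real cluster messaging.  All names here (struct reco_state, take_ex, drop_ex, pick_recovery_master) are hypothetical, and the real code would block waiting for the EX rather than fail immediately.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_NODE (-1)

/* Hypothetical, simplified view of recovery state shared by all nodes. */
struct reco_state {
    int dead_node;   /* node that needs to be recovered */
    int new_master;  /* recovery master named in begin_reco, or NO_NODE */
    int ex_holder;   /* node holding EX on $RECOVERY, or NO_NODE */
};

/* Take the EX on $RECOVERY; fails if another node holds it. */
static bool take_ex(struct reco_state *s, int node)
{
    if (s->ex_holder != NO_NODE)
        return false;
    s->ex_holder = node;
    return true;
}

/* Drop the EX; only the current holder may do so. */
static void drop_ex(struct reco_state *s, int node)
{
    assert(s->ex_holder == node);
    s->ex_holder = NO_NODE;
}

/* Returns true if node became recovery master, false if it backed off. */
static bool pick_recovery_master(struct reco_state *s, int node)
{
    bool won = false;

    if (!take_ex(s, node))
        return false;

    /* Holding the EX is not enough: check whether begin_reco was
     * already sent while this node was waiting for the lock. */
    if (s->new_master == NO_NODE) {
        s->new_master = node;  /* stands in for sending begin_reco */
        won = true;
    }

    drop_ex(s, node);  /* the EX only protects the begin_reco step */
    return won;
}
```

With two nodes calling pick_recovery_master in turn, both obtain the EX, but only the first finds new_master unset and becomes the recovery master; the second sees that begin_reco has already gone out and backs off.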

One note on all of this: this is NOT how we would like to do recovery going forward.  We simply did not have a solid cluster membership service in place that we could use when the mastery/recovery code was written.  Once we do have a stable mechanism and API (stop/start/finish) to depend upon, I would like to rewrite the whole thing for lock-table-based mastery and much more sensible recovery.  As it stands, it's a brittle structure that has to continually try to detect node failures inline and make adjustments as recovery is ongoing, which is no fun.

Thanks!
-kurt


* [Ocfs2-devel] dlm_pick_recovery_master algorithm?
From: Daniel Phillips @ 2006-05-31 23:01 UTC
  To: ocfs2-devel

Thanks Kurt, great answers!

You wrote:
> One note on all of this: this is NOT how we would like to do recovery
> going forward, we just did not have a solid cluster membership service
> in place that we could use when the mastery/recovery code was written.
> Once we do have a stable mechanism and API (stop/start/finish) to depend
> upon, I would like to rewrite the whole thing for lock-table-based mastery
> and much more sensible recovery.

What is the pedigree of that stop/start/finish API?  Is it the only stable
mechanism you know of to build a more sensible recovery on?

> As it stands, it's a brittle structure
> that has to continually try to detect node failures inline and make
> adjustments as recovery is ongoing, which is no fun.

Not to mention slow, and not obviously terminating.

Regards,

Daniel
