cluster-devel.redhat.com archive mirror
* [Cluster-devel] cluster4 gfs_controld
From: David Teigland @ 2011-10-13 14:20 UTC
  To: cluster-devel.redhat.com

Here's the outline of my plan to remove/replace the essential bits of
gfs_controld in cluster4.  I expect it'll go away entirely, but there
could be one or two minor things it would still handle on the side.

kernel dlm/gfs2 will continue to be operable with either
. cluster3 dlm_controld/gfs_controld combination, or
. cluster4 dlm_controld only

Two main things from gfs_controld need replacing:

1. jid allocation, first mounter

cluster3
. both from gfs_controld

cluster4
. jid from dlm-kernel "slots" which will be assigned similarly
. first mounter using a dlm lock in lock_dlm
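One way to picture the "first mounter using a dlm lock" idea is a
non-blocking exclusive lock attempt: whoever is granted the lock treats
itself as the first mounter.  The sketch below is only a userspace mock
under that assumption.  struct mock_lock, mock_try_exclusive() and
mount_is_first() are hypothetical names, no real dlm call is made, and
releasing or downgrading the lock is deliberately not modelled.

```c
#include <assert.h>

/* Hypothetical stand-in for a cluster-wide lock on the filesystem. */
struct mock_lock {
    int ex_held;    /* 1 once some node holds the lock exclusively */
};

/* Non-blocking exclusive request: 1 if granted, 0 if already held. */
static int mock_try_exclusive(struct mock_lock *lk)
{
    if (lk->ex_held)
        return 0;
    lk->ex_held = 1;
    return 1;
}

/* Assumption: the node that wins the exclusive lock is the first mounter. */
static int mount_is_first(struct mock_lock *lk)
{
    return mock_try_exclusive(lk);
}
```

With a fresh lock, the first call to mount_is_first() returns 1 and
every later call returns 0.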

2. recovery coordination, failure notification

cluster3
. coordination of dlm-kernel/gfs-kernel recovery is done
  indirectly in userspace between dlm_controld/gfs_controld,
  which then toggle sysfs files.
. write("sysfs block", 1) -> block_store(1)
  write("sysfs recover", jid) -> recover_store(jid)
  write("sysfs block", 0) -> block_store(0)
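
As a rough illustration of the userspace side, the sketch below writes
the three values in the order used by the cluster3 recovery steps
(block 1, recover jid, block 0).  It is not gfs_controld code.
write_attr() and recover_journal() are hypothetical helpers, and the
directory holding the "block" and "recover" files is passed in by the
caller rather than assumed.

```c
#include <assert.h>
#include <stdio.h>

/* Write a decimal value to <dir>/<name>.  Returns 0 on success, -1 on error. */
static int write_attr(const char *dir, const char *name, int value)
{
    char path[512];
    FILE *f;
    int n = snprintf(path, sizeof(path), "%s/%s", dir, name);

    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;
    f = fopen(path, "w");
    if (!f)
        return -1;
    if (fprintf(f, "%d", value) < 0) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? 0 : -1;
}

/*
 * Block, recover the given journal, then unblock.
 * If the "recover" write fails, the filesystem is left blocked;
 * a real daemon would need its own policy for that case.
 */
static int recover_journal(const char *dir, int jid)
{
    assert(dir != NULL);
    if (write_attr(dir, "block", 1))
        return -1;
    if (write_attr(dir, "recover", jid))
        return -1;
    return write_attr(dir, "block", 0);
}
```

Taking the directory as a parameter lets the same sequence be tried
against a scratch directory before pointing it at real sysfs files.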

cluster4
. coordination of dlm-kernel/gfs-kernel recovery is done
  directly in kernel using callbacks from dlm-kernel to gfs-kernel.
. gdlm_mount(struct gfs2_sbd *sdp, const char *table, int *first, int *jid)
  calls dlm_recover_register(dlm, &jid, &recover_callbacks)
. gdlm_recover_prep() -> block_store(1)
  gdlm_recover_slot(jid) -> recover_store(jid)
  gdlm_recover_done() -> block_store(0)
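
The callback arrangement can be mocked in plain userspace C to show the
ordering.  This is a sketch only, not the kernel interface: the layout
of struct recover_callbacks, the mock_* names and the void *arg
parameter are assumptions made for illustration.  The only behaviour
taken from the outline above is prep -> block, slot(jid) -> recover,
done -> unblock.

```c
#include <assert.h>

/* Hypothetical callback table handed from the gfs side to the dlm side. */
struct recover_callbacks {
    void (*recover_prep)(void *arg);
    void (*recover_slot)(void *arg, int jid);
    void (*recover_done)(void *arg);
};

/* Stand-in for the per-filesystem state (struct gfs2_sbd in the outline). */
struct mock_sbd {
    int blocked;            /* 1 while locking is blocked */
    int last_recovered_jid; /* -1 until a journal has been recovered */
};

static void mock_recover_prep(void *arg)
{
    struct mock_sbd *sdp = arg;
    sdp->blocked = 1;               /* corresponds to block_store(1) */
}

static void mock_recover_slot(void *arg, int jid)
{
    struct mock_sbd *sdp = arg;
    assert(sdp->blocked == 1);      /* recovery only runs while blocked */
    sdp->last_recovered_jid = jid;  /* corresponds to recover_store(jid) */
}

static void mock_recover_done(void *arg)
{
    struct mock_sbd *sdp = arg;
    sdp->blocked = 0;               /* corresponds to block_store(0) */
}

static const struct recover_callbacks mock_callbacks = {
    .recover_prep = mock_recover_prep,
    .recover_slot = mock_recover_slot,
    .recover_done = mock_recover_done,
};

/* Stand-in for the dlm side, which remembers the registered callbacks. */
struct mock_dlm {
    const struct recover_callbacks *cb;
    void *arg;
};

static void mock_dlm_register(struct mock_dlm *dlm,
                              const struct recover_callbacks *cb, void *arg)
{
    dlm->cb = cb;
    dlm->arg = arg;
}

/* What the dlm side would do when the node that used failed_jid goes down. */
static void mock_dlm_recover(struct mock_dlm *dlm, int failed_jid)
{
    dlm->cb->recover_prep(dlm->arg);
    dlm->cb->recover_slot(dlm->arg, failed_jid);
    dlm->cb->recover_done(dlm->arg);
}
```

After mock_dlm_register() and one mock_dlm_recover(&dlm, 2) call, the
mock filesystem ends up unblocked with last_recovered_jid set to 2.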

cluster3 dlm/gfs recovery
. dlm_controld sees nodedown                      (libcpg)
. gfs_controld sees nodedown                      (libcpg)
. dlm_controld stops dlm-kernel                   (sysfs control 0)
. gfs_controld stops gfs-kernel                   (sysfs block 1)
. dlm_controld waits for gfs_controld kernel stop (libdlmcontrol)
. gfs_controld waits for dlm_controld kernel stop (libdlmcontrol)
. dlm_controld syncs state among all nodes        (libcpg)
. gfs_controld syncs state among all nodes        (libcpg)
. dlm_controld starts dlm-kernel recovery         (sysfs control 1)
. gfs_controld starts gfs-kernel recovery         (sysfs recover jid)
. gfs_controld starts gfs-kernel                  (sysfs block 0)

cluster4 dlm/gfs recovery
. dlm_controld sees nodedown                      (libcpg)
. dlm_controld stops dlm-kernel                   (sysfs control 0)
. dlm-kernel stops gfs-kernel                     (callback block 1)
. dlm_controld syncs state among all nodes        (libcpg)
. dlm_controld starts dlm-kernel recovery         (sysfs control 1)
. dlm-kernel starts gfs-kernel recovery           (callback recover jid)
. dlm-kernel starts gfs-kernel                    (callback block 0)


