[Cluster-devel] cluster/group/gfs_controld lock_dlm.h plock.c ...


From: teigland@sourceware.org <teigland@sourceware.org>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] cluster/group/gfs_controld lock_dlm.h plock.c  ...
Date: 8 Aug 2006 21:19:18 -0000
Message-ID: <20060808211918.17762.qmail@sourceware.org>

CVSROOT:	/cvs/cluster
Module name:	cluster
Changes by:	teigland at sourceware.org	2006-08-08 21:19:18

Modified files:
	group/gfs_controld: lock_dlm.h plock.c recover.c 

Log message:
	The idea of letting the last node that wrote the checkpoint reuse it,
	even when it is no longer the low nodeid, doesn't work: the new mounter
	tries to read the ckpt as soon as it gets the journals message from the
	low nodeid, which can happen before the other node has written the
	ckpt.  Now the low nodeid is always the one to create the ckpt for a
	new mounter, which means a node holding the last ckpt needs to unlink
	it when it sees a new low nodeid join the group.
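
The hand-off rules described above can be sketched as three small predicates. This is only an illustrative model based on the conditions visible in the patch below; the helper names (should_store_ckpt, storer_should_close, mounter_should_unlink) are hypothetical and do not exist in gfs_controld, and the real code also talks to the checkpoint service, which is left out here.

```c
#include <stdbool.h>

/* Hypothetical helpers; names are illustrative, not from gfs_controld. */

/* Only the low finished nodeid stores plocks for a new mounter
   (the patch drops the old "|| mg->cp_handle" alternative). */
static bool should_store_ckpt(int low_finished_nodeid, int our_nodeid)
{
	return low_finished_nodeid == our_nodeid;
}

/* After storing, the storer closes its ckpt handle if the mounter has a
   lower nodeid than ours, since the mounter will own future ckpts. */
static bool storer_should_close(int mounter_nodeid, int our_nodeid)
{
	return mounter_nodeid < our_nodeid;
}

/* After reading, the mounter unlinks the ckpt if it is now the low
   nodeid; otherwise it only closes its handle. */
static bool mounter_should_unlink(int low_nodeid, int our_nodeid)
{
	return low_nodeid == our_nodeid;
}
```

For example, when node 1 mounts a group whose current low node is 2, node 2 stores and then closes the ckpt, and node 1 unlinks it after reading, so that node 1 can create a fresh ckpt for the next mounter.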

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/group/gfs_controld/lock_dlm.h.diff?cvsroot=cluster&r1=1.11&r2=1.12
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/group/gfs_controld/plock.c.diff?cvsroot=cluster&r1=1.10&r2=1.11
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/group/gfs_controld/recover.c.diff?cvsroot=cluster&r1=1.9&r2=1.10

--- cluster/group/gfs_controld/lock_dlm.h	2006/08/07 16:57:50	1.11
+++ cluster/group/gfs_controld/lock_dlm.h	2006/08/08 21:19:17	1.12
@@ -140,6 +140,7 @@
 	int			emulate_first_mounter;
 	int			wait_first_done;
 	int			low_finished_nodeid;
+	int			low_nodeid;
 	int			save_plocks;
 
 	uint64_t		cp_handle;
@@ -259,7 +260,7 @@
 
 int send_group_message(struct mountgroup *mg, int len, char *buf);
 
-void store_plocks(struct mountgroup *mg);
+void store_plocks(struct mountgroup *mg, int nodeid);
 void retrieve_plocks(struct mountgroup *mg);
 int dump_plocks(char *name, int fd);
 void process_saved_plocks(struct mountgroup *mg);
--- cluster/group/gfs_controld/plock.c	2006/08/08 19:37:33	1.10
+++ cluster/group/gfs_controld/plock.c	2006/08/08 21:19:17	1.11
@@ -1094,20 +1094,18 @@
 	return ret;
 }
 
-/* Copy all plock state into a checkpoint so new node can retrieve it.
+/* Copy all plock state into a checkpoint so new node can retrieve it.  The
+   node creating the ckpt for the mounter needs to be the same node that's
+   sending the mounter its journals message (i.e. the low nodeid).  The new
+   mounter knows the ckpt is ready to read only after it gets its journals
+   message.
+ 
+   If the mounter is becoming the new low nodeid in the group, the node doing
+   the store closes the ckpt and the new node unlinks the ckpt after reading
+   it.  The ckpt should then disappear and the new node can create a new ckpt
+   for the next mounter. */
 
-   The low node in the group and the previous node to create the ckpt (with
-   non-zero cp_handle) may be different if a new node joins with a lower nodeid
-   than the previous low node that created the ckpt.  In this case, the prev
-   node has the old ckpt open and will reuse it if no plock state has changed,
-   or will unlink it and create a new one.  The low node will also attempt to
-   create a new ckpt.  That open-create will either fail due to the prev node
-   reusing the old ckpt, or it will race with the open-create on the prev node
-   after the prev node unlinks the old ckpt.  Either way, when there are two
-   different nodes in the group calling store_plocks(), one of them will fail
-   at the Open(CREATE) step with ERR_EXIST due to the other. */
-
-void store_plocks(struct mountgroup *mg)
+void store_plocks(struct mountgroup *mg, int nodeid)
 {
 	SaCkptCheckpointCreationAttributesT attr;
 	SaCkptCheckpointHandleT h;
@@ -1128,8 +1126,8 @@
 
 	/* no change to plock state since we created the last checkpoint */
 	if (mg->last_checkpoint_time > mg->last_plock_time) {
-		log_group(mg, "store_plocks: ckpt uptodate");
-		return;
+		log_group(mg, "store_plocks: saved ckpt uptodate");
+		goto out;
 	}
 	mg->last_checkpoint_time = time(NULL);
 
@@ -1236,6 +1234,17 @@
 			break;
 		}
 	}
+
+ out:
+	/* If the new nodeid is becoming the low nodeid it will now be in
+	   charge of creating ckpt's for mounters instead of us. */
+
+	if (nodeid < our_nodeid) {
+		log_group(mg, "store_plocks: closing ckpt for new low node %d",
+			  nodeid);
+		saCkptCheckpointClose(h);
+		mg->cp_handle = 0;
+	}
 }
 
 /* called by a node that's just been added to the group to get existing plock
@@ -1336,7 +1345,11 @@
  out_it:
 	saCkptSectionIterationFinalize(itr);
  out:
-	saCkptCheckpointClose(h);
+	if (mg->low_nodeid == our_nodeid) {
+		log_group(mg, "retrieve_plocks: unlink ckpt from old low node");
+		unlink_checkpoint(mg, &name);
+	} else
+		saCkptCheckpointClose(h);
 }
 
 void purge_plocks(struct mountgroup *mg, int nodeid, int unmount)
--- cluster/group/gfs_controld/recover.c	2006/08/07 16:57:50	1.9
+++ cluster/group/gfs_controld/recover.c	2006/08/08 21:19:17	1.10
@@ -589,8 +589,8 @@
 	log_group(mg, "assign_journal: new member %d got jid %d",
 		  new->nodeid, new->jid);
 
-	if (mg->low_finished_nodeid == our_nodeid || mg->cp_handle)
-		store_plocks(mg);
+	if (mg->low_finished_nodeid == our_nodeid)
+		store_plocks(mg, new->nodeid);
 
 	/* if we're the first mounter and haven't gotten others_may_mount
 	   yet, then don't send journals until kernel_recovery_done_first
@@ -982,6 +982,8 @@
 	}
 
 	list_for_each_entry(memb, &mg->members, list) {
+		if (mg->low_nodeid == -1 || memb->nodeid < mg->low_nodeid)
+			mg->low_nodeid = memb->nodeid;
 		if (!memb->finished)
 			continue;
 		if (low == -1 || memb->nodeid < low)
@@ -1008,6 +1010,7 @@
 	INIT_LIST_HEAD(&mg->resources);
 	INIT_LIST_HEAD(&mg->saved_messages);
 	mg->init = 1;
+	mg->low_nodeid = -1;
 
 	strncpy(mg->name, name, MAXNAME);
 
@@ -1902,7 +1905,7 @@
 			continue;
 
 		if (!stored_plocks) {
-			store_plocks(mg);
+			store_plocks(mg, memb->nodeid);
 			stored_plocks = 1;
 		}
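
The recover.c hunk that tracks low_nodeid alongside the existing low finished nodeid can be illustrated with a standalone sketch. The struct member and struct low_ids types and the find_low_ids function are hypothetical simplifications: the real code walks mg->members with list_for_each_entry and stores the result in the mountgroup. This sketch also recomputes both values from -1 on every call, whereas the patch sets mg->low_nodeid to -1 once when the mountgroup is created.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a mountgroup member. */
struct member {
	int nodeid;
	int finished;	/* mirrors memb->finished in the patch */
};

struct low_ids {
	int low_nodeid;		/* lowest nodeid among all members, -1 if none */
	int low_finished;	/* lowest nodeid among finished members, -1 if none */
};

/* Scan the members once, tracking both minimums as the patched loop does. */
static struct low_ids find_low_ids(const struct member *m, size_t n)
{
	struct low_ids r = { -1, -1 };
	size_t i;

	for (i = 0; i < n; i++) {
		/* every member counts toward low_nodeid */
		if (r.low_nodeid == -1 || m[i].nodeid < r.low_nodeid)
			r.low_nodeid = m[i].nodeid;
		/* only finished members count toward low_finished */
		if (!m[i].finished)
			continue;
		if (r.low_finished == -1 || m[i].nodeid < r.low_finished)
			r.low_finished = m[i].nodeid;
	}
	return r;
}
```

With members {3, finished}, {1, not finished} and {2, finished}, low_nodeid is 1 while low_finished is 2, which is the situation where the storing node and the future ckpt owner differ.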
