ocfs2-devel.oss.oracle.com archive mirror
From: Sunil Mushran <sunil.mushran@oracle.com>
To: ocfs2-devel@oss.oracle.com
Subject: [Ocfs2-devel] [PATCH 10/20] ocfs2/cluster: Check slots for unconfigured live nodes
Date: Tue, 14 Sep 2010 15:50:46 -0700
Message-ID: <1284504656-2434-11-git-send-email-sunil.mushran@oracle.com>
In-Reply-To: <1284504656-2434-1-git-send-email-sunil.mushran@oracle.com>

o2hb currently checks the heartbeat slots of configured nodes only. This
patch makes it also check the slots of live nodes, to handle a race in
which a node is removed from the configuration but not from the live map.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
---
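A minimal user-space sketch of the slot selection described above, for
readers who want to see the idea outside the kernel. It is not the kernel
code: SKETCH_MAX_NODES and every sketch_* helper are hypothetical stand-ins,
and the merge uses a word-wise OR rather than the find_next_bit()/set_bit()
loop in the patch. The point it illustrates is that the set of slots to read
becomes the union of the configured map and the live map, so a node that was
unconfigured while still live keeps getting its slot read.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel definitions. */
#define SKETCH_MAX_NODES	255
#define SKETCH_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_LONGS \
	((SKETCH_MAX_NODES + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

/* Set bit nr in map. */
static void sketch_set_bit(unsigned int nr, unsigned long *map)
{
	assert(nr < SKETCH_MAX_NODES);
	map[nr / SKETCH_BITS_PER_LONG] |= 1UL << (nr % SKETCH_BITS_PER_LONG);
}

/* Return 1 if bit nr is set in map, 0 otherwise. */
static int sketch_test_bit(unsigned int nr, const unsigned long *map)
{
	assert(nr < SKETCH_MAX_NODES);
	return (map[nr / SKETCH_BITS_PER_LONG] >>
		(nr % SKETCH_BITS_PER_LONG)) & 1UL;
}

/* Fold the live map into the map of slots to scan (scan |= live). */
static void sketch_merge_live(unsigned long *scan, const unsigned long *live)
{
	size_t i;

	for (i = 0; i < SKETCH_LONGS; i++)
		scan[i] |= live[i];
}

/*
 * Return the highest set bit, or SKETCH_MAX_NODES if the map is empty.
 * The empty-map return value mirrors the ">= O2NM_MAX_NODES" check that
 * the patched function applies to its highest_node result.
 */
static unsigned int sketch_highest_node(const unsigned long *map)
{
	unsigned int i = SKETCH_MAX_NODES;

	while (i-- > 0)
		if (sketch_test_bit(i, map))
			return i;
	return SKETCH_MAX_NODES;
}
```

With nodes 1 and 3 configured and node 7 live but unconfigured, the merged
map contains 1, 3 and 7, so the scan now extends up to slot 7.
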
 fs/ocfs2/cluster/heartbeat.c |   38 +++++++++++++++++++++++++++++++-------
 1 files changed, 31 insertions(+), 7 deletions(-)
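
The handling of an unconfigured node in o2hb_check_slot() can be sketched in
user space as below. This is a simplified model under stated assumptions: the
demo_* types and function are hypothetical, it covers only the path where the
slot has already been judged dead, and it leaves out locking, CRC
verification, timing, and the list_empty() test on the live slots. A NULL
node stands in for o2nm_get_node_by_num() finding no configured node.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_NODES 8	/* hypothetical; small for illustration */

struct demo_node {
	int refs;		/* stands in for the o2nm node reference count */
};

struct demo_state {
	bool live[DEMO_MAX_NODES];	/* stands in for o2hb_live_node_bitmap */
	int down_events;		/* counts queued node-down callbacks */
};

/*
 * Model of the "slot went down" path after the patch.
 * Returns 1 if the live map changed, 0 otherwise.
 */
static int demo_slot_went_down(struct demo_state *st, unsigned int num,
			       struct demo_node *node)
{
	int changed = 0;

	assert(num < DEMO_MAX_NODES);

	/* Unconfigured and not in the live map: nothing to clean up. */
	if (!node && !st->live[num])
		return 0;

	if (st->live[num]) {
		/* The live bit is cleared whether or not a node exists. */
		st->live[num] = false;

		/* A callback needs a node object, so skip it when NULL. */
		if (node)
			st->down_events++;

		changed = 1;
	}

	/* Drop the reference only if one was taken. */
	if (node)
		node->refs--;

	return changed;
}
```

In this model an unconfigured node that is still live has its bit cleared and
reports a change, but no node-down callback is queued and no reference is
dropped, matching the two "if (node)" guards added by the patch.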

diff --git a/fs/ocfs2/cluster/heartbeat.c b/fs/ocfs2/cluster/heartbeat.c
index 1d71856..de798c7 100644
--- a/fs/ocfs2/cluster/heartbeat.c
+++ b/fs/ocfs2/cluster/heartbeat.c
@@ -593,14 +593,24 @@ static int o2hb_check_slot(struct o2hb_region *reg,
 	u64 cputime;
 	unsigned int dead_ms = o2hb_dead_threshold * O2HB_REGION_TIMEOUT_MS;
 	unsigned int slot_dead_ms;
+	int tmp;
 
 	memcpy(hb_block, slot->ds_raw_block, reg->hr_block_bytes);
 
-	/* Is this correct? Do we assume that the node doesn't exist
-	 * if we're not configured for him? */
+	/*
+	 * If a node is no longer configured but is still in the livemap, we
+	 * may need to clear that bit from the livemap.
+	 */
 	node = o2nm_get_node_by_num(slot->ds_node_num);
-	if (!node)
-		return 0;
+	if (!node) {
+		spin_lock(&o2hb_live_lock);
+		tmp = test_bit(slot->ds_node_num, o2hb_live_node_bitmap);
+		spin_unlock(&o2hb_live_lock);
+		if (!tmp)
+			return 0;
+		printk(KERN_NOTICE "o2hb: Live node %d is not registered\n",
+		       slot->ds_node_num);
+	}
 
 	if (!o2hb_verify_crc(reg, hb_block)) {
 		/* all paths from here will drop o2hb_live_lock for
@@ -717,8 +727,9 @@ fire_callbacks:
 		if (list_empty(&o2hb_live_slots[slot->ds_node_num])) {
 			clear_bit(slot->ds_node_num, o2hb_live_node_bitmap);
 
-			o2hb_queue_node_event(&event, O2HB_NODE_DOWN_CB, node,
-					      slot->ds_node_num);
+			if (node)
+				o2hb_queue_node_event(&event, O2HB_NODE_DOWN_CB,
+						      node, slot->ds_node_num);
 
 			changed = 1;
 		}
@@ -738,7 +749,8 @@ out:
 
 	o2hb_run_event_list(&event);
 
-	o2nm_node_put(node);
+	if (node)
+		o2nm_node_put(node);
 	return changed;
 }
 
@@ -765,6 +777,7 @@ static int o2hb_do_disk_heartbeat(struct o2hb_region *reg)
 {
 	int i, ret, highest_node, change = 0;
 	unsigned long configured_nodes[BITS_TO_LONGS(O2NM_MAX_NODES)];
+	unsigned long live_node_bitmap[BITS_TO_LONGS(O2NM_MAX_NODES)];
 	struct o2hb_bio_wait_ctxt write_wc;
 
 	ret = o2nm_configured_node_map(configured_nodes,
@@ -774,6 +787,17 @@ static int o2hb_do_disk_heartbeat(struct o2hb_region *reg)
 		return ret;
 	}
 
+	/*
+	 * If a node is not configured but is in the livemap, we still need
+	 * to read the slot so as to be able to remove it from the livemap.
+	 */
+	o2hb_fill_node_map(live_node_bitmap, sizeof(live_node_bitmap));
+	i = -1;
+	while ((i = find_next_bit(live_node_bitmap,
+				  O2NM_MAX_NODES, i + 1)) < O2NM_MAX_NODES) {
+		set_bit(i, configured_nodes);
+	}
+
 	highest_node = o2hb_highest_node(configured_nodes, O2NM_MAX_NODES);
 	if (highest_node >= O2NM_MAX_NODES) {
 		mlog(ML_NOTICE, "ocfs2_heartbeat: no configured nodes found!\n");
-- 
1.7.0.4
