From mboxrd@z Thu Jan  1 00:00:00 1970
From: jbrassow@sourceware.org <jbrassow@sourceware.org>
Date: 13 Feb 2008 15:06:23 -0000
Subject: [Cluster-devel] cluster/cmirror-kernel/src dm-clog.c
Message-ID: <20080213150623.25294.qmail@sourceware.org>
List-Id: <cluster-devel.redhat.com>
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	RHEL5
Changes by:	jbrassow at sourceware.org	2008-02-13 15:06:23

Modified files:
	cmirror-kernel/src: dm-clog.c 

Log message:
	- change the way 'is_remote_recovering' works to improve overall
	performance.
	
	Before a mirror issues a write, it must call 'is_remote_recovering'
	to ensure that another machine will not be recovering the region during
	the write.  This function can dramatically slow things down.  One way to
	increase performance is to note when the mirror is in-sync - then
	is_remote_recovering can return 0 without having to send the request
	around the cluster.  (This has already been done.)  This greatly speeds up
	I/O during nominal mirror operation.  However, I/O during mirror resyncing
	is still greatly reduced.  The problem is that the cluster network is so
	consumed with handling 'is_remote_recovering' calls that it becomes hard
	to actually do the recovery.  The fix is to allow only one
	is_remote_recovering call to go to the cluster every 1/4 sec; calls
	inside that window that cannot be answered locally report the region as
	recovering.  When the call goes up to userspace, it also retrieves info
	about how far along the resync is.  If that info shows a request's region
	is already in sync, then the region is not recovering and the request can
	safely be answered without being sent on to the cluster.  This approach
	has greatly improved both the recovery and nominal throughput.
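
The throttling described above can be sketched in user space roughly as
follows.  This is only an illustration under stated assumptions, not the
code in dm-clog.c: every sim_* name, the SIM_HZ constant, the simulated
clock passed in as 'now', and the fake cluster reply are hypothetical.
Unlike the patch, which keeps the time limit in a function-local static,
the sketch stores it per log so the example is easy to test in isolation.

```c
#include <stdint.h>

/* Hypothetical stand-ins for kernel state; not the dm-clog.c definitions. */
#define SIM_HZ 1000u                    /* simulated ticks per second */

struct sim_log {
	uint64_t in_sync_hint;          /* regions below this are known in sync */
	uint64_t next_query;            /* earliest tick for the next cluster query */
	unsigned int cluster_queries;   /* requests actually sent to the cluster */
};

/* Simulated reply from userspace: recovery state plus resync progress. */
struct sim_reply {
	int is_recovering;
	uint64_t sync_search;
};

/*
 * Pretend cluster (assumption): every region below 'resync_pos' is in
 * sync, and the region at 'resync_pos' is the one being recovered.
 */
static int sim_cluster_request(struct sim_log *lc, uint64_t region,
			       uint64_t resync_pos, struct sim_reply *reply)
{
	lc->cluster_queries++;
	reply->is_recovering = (region == resync_pos);
	reply->sync_search = resync_pos;
	return 0;
}

static int sim_is_remote_recovering(struct sim_log *lc, uint64_t region,
				    uint64_t now, uint64_t resync_pos)
{
	struct sim_reply reply;

	/* Answer locally when the cached progress covers this region. */
	if (region < lc->in_sync_hint)
		return 0;

	/* Inside the 1/4 sec window: be conservative, skip the cluster. */
	if (now < lc->next_query)
		return 1;

	lc->next_query = now + SIM_HZ / 4;
	if (sim_cluster_request(lc, region, resync_pos, &reply))
		return 1;

	/* Cache how far the resync has progressed for later calls. */
	lc->in_sync_hint = reply.sync_search;
	return reply.is_recovering;
}
```

The sketch uses a 64-bit simulated tick counter, so the plain '<'
comparison is safe here.  Kernel code comparing against jiffies would
more commonly use the wrap-safe time_before()/time_after() helpers.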

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-clog.c.diff?cvsroot=cluster&only_with_tag=RHEL5&r1=1.2.2.9&r2=1.2.2.10

--- cluster/cmirror-kernel/src/dm-clog.c	2008/02/08 14:21:04	1.2.2.9
+++ cluster/cmirror-kernel/src/dm-clog.c	2008/02/13 15:06:23	1.2.2.10
@@ -23,7 +23,7 @@
 	char *ctr_str; /* Gives ability to restart if userspace dies */
 	uint32_t ctr_size;
 
-	uint32_t in_sync_hint;
+	uint64_t in_sync_hint;
 
 	spinlock_t flush_lock;
 	struct list_head flush_list;  /* only for clear and mark requests */
@@ -588,8 +588,8 @@
 	if (r)
 		return 0;
 
-	if (sync_count == lc->region_count)
-		lc->in_sync_hint = 1;
+	if (sync_count >= lc->region_count)
+		lc->in_sync_hint = lc->region_count;
 	/*
 	 * get_sync_count is never called after the
 	 * initial sync=1
@@ -644,9 +644,10 @@
 static int cluster_is_remote_recovering(struct dirty_log *log, region_t region)
 {
 	int r;
-	int is_recovering;
-	int rdata_size;
 	struct log_c *lc = (struct log_c *)log->context;
+	static unsigned long long limit = 0;
+	struct { int is_recovering; uint64_t sync_search; } pkg;
+	int rdata_size = sizeof(pkg);
 
 	/*
 	 * Once the mirror has been reported to be in-sync,
@@ -655,14 +656,21 @@
 	 * recovering if the device is in-sync.  (in_sync_hint
 	 * must be reset at resume time.)
 	 */
-	if (lc->in_sync_hint)
+	if (region < lc->in_sync_hint)
 		return 0;
+	else if (jiffies < limit)
+		return 1;
 
-	rdata_size = sizeof(is_recovering);
+	limit = jiffies + (HZ / 4);
 	r = cluster_do_request(lc, lc->uuid, DM_CLOG_IS_REMOTE_RECOVERING,
 			       (char *)&region, sizeof(region),
-			       (char *)&is_recovering, &rdata_size);
-	return (r) ? 1 : is_recovering;
+			       (char *)&pkg, &rdata_size);
+	if (r)
+		return 1;
+
+	lc->in_sync_hint = pkg.sync_search;
+
+	return pkg.is_recovering;
 }
 
 static struct dirty_log_type _clustered_core_type = {