cluster-devel.redhat.com archive mirror
* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-server.c
@ 2006-07-19 14:40 jbrassow
  0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2006-07-19 14:40 UTC (permalink / raw)
  To: cluster-devel.redhat.com

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	STABLE
Changes by:	jbrassow at sourceware.org	2006-07-19 14:40:15

Modified files:
	cmirror-kernel/src: dm-cmirror-server.c 

Log message:
	- Fix for:
	198563 - clvmd panic in dm_mod:resize_pool while ...
	198659 - slab error in kmem_cache_destroy() on m ...
	
	The log server was not informing the device-mapper core of its
	intentions to use its I/O interfaces.  This caused device-mapper
	to prematurely release the resources.
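
The get/put discipline described above can be sketched in userspace C. This is only an illustration, assuming dm_io_get() and dm_io_put() act as a counted reservation on a shared pool, which the log message implies but does not spell out. Every name below (io_pool, pool_get, pool_put, pool_is_live) is hypothetical and not part of device-mapper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a shared I/O resource pool. */
struct io_pool {
	unsigned int reserved;	/* pages reserved by all current users */
	void *mem;		/* backing storage, NULL when nobody holds it */
};

static struct io_pool pool;

/* Register intent to use the pool by reserving num_pages. */
int pool_get(unsigned int num_pages)
{
	if (pool.reserved == 0) {
		pool.mem = malloc(1);	/* placeholder for the real allocation */
		if (!pool.mem)
			return -1;
	}
	pool.reserved += num_pages;
	return 0;
}

/* Drop a reservation; storage is freed only when the last user leaves. */
void pool_put(unsigned int num_pages)
{
	assert(num_pages <= pool.reserved);
	pool.reserved -= num_pages;
	if (pool.reserved == 0) {
		free(pool.mem);
		pool.mem = NULL;
	}
}

/* Nonzero while at least one reservation is outstanding. */
int pool_is_live(void)
{
	return pool.mem != NULL;
}
```

In this model, a user that never calls pool_get() can have the storage freed underneath it as soon as the other users call pool_put(), which is the kind of premature release the log message describes.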

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-server.c.diff?cvsroot=cluster&only_with_tag=STABLE&r1=1.1.4.3&r2=1.1.4.4

--- cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2006/06/29 19:49:32	1.1.4.3
+++ cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2006/07/19 14:40:15	1.1.4.4
@@ -1084,6 +1084,7 @@
 		DMWARN("Cluster mirror log server thread failed to start");
 		return -1;
 	}
+	dm_io_get(32);
 	return 0;
 }
 
@@ -1092,6 +1093,7 @@
 	atomic_set(&server_run, 0);
 
 	wait_for_completion(&server_completion);
+	dm_io_put(32);
 }
 /*
  * Overrides for Emacs so that we follow Linus's tabbing style.



* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-server.c
@ 2007-10-26 18:46 jbrassow
  0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2007-10-26 18:46 UTC (permalink / raw)
  To: cluster-devel.redhat.com

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	RHEL4
Changes by:	jbrassow at sourceware.org	2007-10-26 18:46:11

Modified files:
	cmirror-kernel/src: dm-cmirror-server.c 

Log message:
	-take out annoying message.

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-server.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.1.2.41&r2=1.1.2.42

--- cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2007/10/03 19:02:51	1.1.2.41
+++ cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2007/10/26 18:46:10	1.1.2.42
@@ -637,9 +637,6 @@
 		}
 	}
 
-	DMDEBUG("Priority recovery region: %Lu/%s",
-		lc->recovering_next, lc->uuid + (strlen(lc->uuid) - 8));
-
 	if ((lc->recovering_next != (uint64_t)-1) &&
 	    (!log_test_bit(lc->sync_bits, lc->recovering_next))) {
 		new = mempool_alloc(region_user_pool, GFP_NOFS);



* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-server.c
@ 2007-04-17 19:49 jbrassow
  0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2007-04-17 19:49 UTC (permalink / raw)
  To: cluster-devel.redhat.com

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	RHEL4
Changes by:	jbrassow at sourceware.org	2007-04-17 20:49:11

Modified files:
	cmirror-kernel/src: dm-cmirror-server.c 

Log message:
	- A message was being printed as DMERR even though the case was
	valid; switching it to DMDEBUG.

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-server.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.1.2.32&r2=1.1.2.33

--- cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2007/04/10 18:09:09	1.1.2.32
+++ cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2007/04/17 19:49:11	1.1.2.33
@@ -706,8 +706,8 @@
 	}
 
 	if (!ru) {
-		DMERR("Unable to find region to be marked out-of-sync: %Lu/%s/%u",
-		      lr->u.lr_region, lc->uuid + (strlen(lc->uuid) - 8), who);
+		DMDEBUG("Unable to find region to be marked out-of-sync: %Lu/%s/%u",
+			lr->u.lr_region, lc->uuid + (strlen(lc->uuid) - 8), who);
 		/*
 		 * This is a valid case, when the following happens:
 		 * 1) a region is recovering and has waiting writes



* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-server.c
@ 2007-04-10 18:10 jbrassow
  0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2007-04-10 18:10 UTC (permalink / raw)
  To: cluster-devel.redhat.com

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	RHEL45
Changes by:	jbrassow at sourceware.org	2007-04-10 19:10:42

Modified files:
	cmirror-kernel/src: dm-cmirror-server.c 

Log message:
	Remove overzealous BUG() statement.
	
	Comments (justification) in-line.

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-server.c.diff?cvsroot=cluster&only_with_tag=RHEL45&r1=1.1.2.26.2.5&r2=1.1.2.26.2.6

--- cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2007/04/10 07:13:15	1.1.2.26.2.5
+++ cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2007/04/10 18:10:42	1.1.2.26.2.6
@@ -708,7 +708,18 @@
 	if (!ru) {
 		DMERR("Unable to find region to be marked out-of-sync: %Lu/%s/%u",
 		      lr->u.lr_region, lc->uuid + (strlen(lc->uuid) - 8), who);
-		BUG();
+		/*
+		 * This is a valid case, when the following happens:
+		 * 1) a region is recovering and has waiting writes
+		 * 2) recovery fails and calls complete_resync_work (w/ failure)
+		 * 2.1) RU is removed from our list
+		 * 3) waiting writes are released
+		 * 3.1) writes do not mark, because b/c region state != RH_CLEAN
+		 * 4) write fails and calls complete_resync_work (w/ failure)
+		 * 5) boom, we are here.
+		 *
+		 * Not a bug to be here
+		 */
 	} else 	if (ru->ru_rw == RU_RECOVER) {
 		if (lr->u.lr_region != lc->recovering_region) {
 			DMERR("Recovering region mismatch from node %u: (%Lu/%Lu)",



* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-server.c
@ 2007-04-10 18:09 jbrassow
  0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2007-04-10 18:09 UTC (permalink / raw)
  To: cluster-devel.redhat.com

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	RHEL4
Changes by:	jbrassow at sourceware.org	2007-04-10 19:09:09

Modified files:
	cmirror-kernel/src: dm-cmirror-server.c 

Log message:
	Remove an overzealous BUG statement.
	
	Comments inlined

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-server.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.1.2.31&r2=1.1.2.32

--- cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2007/04/10 07:12:24	1.1.2.31
+++ cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2007/04/10 18:09:09	1.1.2.32
@@ -708,7 +708,18 @@
 	if (!ru) {
 		DMERR("Unable to find region to be marked out-of-sync: %Lu/%s/%u",
 		      lr->u.lr_region, lc->uuid + (strlen(lc->uuid) - 8), who);
-		BUG();
+		/*
+		 * This is a valid case, when the following happens:
+		 * 1) a region is recovering and has waiting writes
+		 * 2) recovery fails and calls complete_resync_work (w/ failure)
+		 * 2.1) RU is removed from our list
+		 * 3) waiting writes are released
+		 * 3.1) writes do not mark, because b/c region state != RH_CLEAN
+		 * 4) write fails and calls complete_resync_work (w/ failure)
+		 * 5) boom, we are here.
+		 *
+		 * Not a bug to be here
+		 */
 	} else 	if (ru->ru_rw == RU_RECOVER) {
 		if (lr->u.lr_region != lc->recovering_region) {
 			DMERR("Recovering region mismatch from node %u: (%Lu/%Lu)",



* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-server.c
@ 2007-04-04 21:36 jbrassow
  0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2007-04-04 21:36 UTC (permalink / raw)
  To: cluster-devel.redhat.com

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	RHEL45
Changes by:	jbrassow at sourceware.org	2007-04-04 22:36:02

Modified files:
	cmirror-kernel/src: dm-cmirror-server.c 

Log message:
	Bug 235252: cmirror synchronization deadlocked waiting for response fro...
	
	Moved the check for recovery/write conflict to flush from mark_region
	to avoid potential conflicts that were causing writes to indefinitely
	hang on failure conditions.
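
The flush-time conflict check can be modelled in a few lines of userspace C. This is a simplified sketch, not the kernel code: it uses a flat array instead of the kernel list, and the names flush_check, NO_REGION and the struct layout are assumptions made for illustration. The rule it encodes is the one in the patch below: a flush is delayed when the region being recovered has more than one registered user.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define NO_REGION ((uint64_t)-1)	/* "nothing is being recovered" */

enum ru_type { RU_WRITE, RU_RECOVER };

/* Simplified region-user record (hypothetical layout). */
struct region_user {
	uint64_t region;
	enum ru_type rw;
};

/*
 * Return 0 if the flush may proceed, or -EBUSY if the recovering
 * region has more than one user (recoverer plus at least one writer).
 */
int flush_check(const struct region_user *users, size_t n,
		uint64_t recovering_region)
{
	size_t i, count = 0;

	assert(users != NULL || n == 0);

	if (recovering_region == NO_REGION)
		return 0;

	for (i = 0; i < n; i++)
		if (users[i].region == recovering_region)
			count++;

	return (count > 1) ? -EBUSY : 0;
}
```

Because the check lives in the flush path, a mark request for a recovering region is recorded rather than rejected, and only the flush that would commit it is delayed.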

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-server.c.diff?cvsroot=cluster&only_with_tag=RHEL45&r1=1.1.2.26.2.2&r2=1.1.2.26.2.3

--- cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2007/04/03 18:23:01	1.1.2.26.2.2
+++ cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2007/04/04 21:36:01	1.1.2.26.2.3
@@ -223,10 +223,13 @@
 	return count;
 }
 
+struct region_user *find_ru_by_region(struct log_c *lc, region_t region);
 static int _core_get_resync_work(struct log_c *lc, region_t *region)
 {
+	int sync_search, conflict = 0;
+
 	if (lc->recovering_region != (uint64_t)-1) {
-		DMDEBUG("Someone is already recovering (%Lu)", lc->recovering_region);
+		DMDEBUG("Someone is already recovering region %Lu", lc->recovering_region);
 		return 0;
 	}
 
@@ -242,16 +245,27 @@
 			return 0;
 		}
 	}
-	*region = ext2_find_next_zero_bit((unsigned long *) lc->sync_bits,
-					  lc->region_count,
-					  lc->sync_search);
-	lc->sync_search = *region + 1;
+	for (sync_search = lc->sync_search;
+	     sync_search < lc->region_count;
+	     sync_search = (*region + 1)) {
+		*region = ext2_find_next_zero_bit((unsigned long *) lc->sync_bits,
+						  lc->region_count,
+						  sync_search);
+		if (find_ru_by_region(lc, *region)) {
+			conflict = 1;
+			DMDEBUG("Recovery blocked by outstanding write on region %Lu",
+			      *region);
+		} else {
+			break;
+		}
+	}
+	if (!conflict)
+		lc->sync_search = *region + 1;
 
 	if (*region >= lc->region_count)
 		return 0;
 
 	lc->recovering_region = *region;
-	DMDEBUG("Assigning recovery work: %Lu", *region);
 	return 1;
 }
 
@@ -374,6 +388,8 @@
 			bad_count++;
 			log_clear_bit(lc, lc->sync_bits, ru->ru_region);
 			if (ru->ru_rw == RU_RECOVER) {
+				DMINFO("Failed node was recovering region %Lu - cleared",
+				       ru->ru_region);
 				lc->recovering_region = (uint64_t)-1;
 			}
 			list_del(&ru->ru_list);
@@ -523,14 +539,19 @@
 		log_clear_bit(lc, lc->clean_bits, lr->u.lr_region);
 		list_add(&new->ru_list, &lc->region_users);
 	} else if (ru->ru_rw == RU_RECOVER) {
+		/*
+		 * The flush will block if a write conflicts with a
+		 * recovering region.  In the meantime, we add this
+		 * entry to the tail of the list so the recovery
+		 * gets cleared first.
+		 */
 		DMDEBUG("Attempt to mark a region " SECTOR_FORMAT
 		      "/%s which is being recovered.",
 		       lr->u.lr_region, lc->uuid + (strlen(lc->uuid) - 8));
 		DMDEBUG("Current recoverer: %u", ru->ru_nodeid);
 		DMDEBUG("Mark requester   : %u", who);
-
-		mempool_free(new, region_user_pool);
-		return -EBUSY;
+		log_clear_bit(lc, lc->clean_bits, lr->u.lr_region);
+		list_add_tail(&new->ru_list, &lc->region_users);
 	} else if (!find_ru(lc, who, lr->u.lr_region)) {
 		list_add(&new->ru_list, &ru->ru_list);
 	} else {
@@ -569,6 +590,34 @@
 static int server_flush(struct log_c *lc)
 {
 	int r = 0;
+	int count = 0;
+	struct region_user *ru, *ru2;
+
+	if (lc->recovering_region != (uint64_t)-1) {
+		list_for_each_entry(ru, &lc->region_users, ru_list)
+			if (ru->ru_region == lc->recovering_region)
+				count++;
+
+		if (count > 1) {
+			list_for_each_entry(ru, &lc->region_users, ru_list)
+				if (ru->ru_rw == RU_RECOVER)
+					break;
+
+			DMDEBUG("Flush includes region which is being recovered (%u/%Lu).  Delaying...",
+				ru->ru_nodeid, ru->ru_region);
+			DMDEBUG("Recovering region: %Lu", lc->recovering_region);
+			DMDEBUG("  sync_bit: %s, clean_bit: %s",
+				log_test_bit(lc->sync_bits, lc->recovering_region) ? "set" : "unset",
+				log_test_bit(lc->clean_bits, lc->recovering_region) ? "set" : "unset");
+
+			list_for_each_entry(ru2, &lc->region_users, ru_list)
+				if (ru->ru_region == ru2->ru_region)
+					DMDEBUG("  %s", (ru2->ru_rw == RU_RECOVER) ? "recover" :
+						(ru2->ru_rw == RU_WRITE) ? "writer" : "unknown");
+
+			return -EBUSY;
+		}
+	}
 
 	r = write_bits(lc);
 	if (!r) {
@@ -597,6 +646,7 @@
 		new->ru_region = lr->u.lr_region_rtn;
 		new->ru_rw = RU_RECOVER;
 		list_add(&new->ru_list, &lc->region_users);
+		DMDEBUG("Assigning recovery work to %u: %Lu", who, new->ru_region);
 	} else {
 		mempool_free(new, region_user_pool);
 	}
@@ -624,6 +674,9 @@
 			log_set_bit(lc, lc->sync_bits, lr->u.lr_region);
 			lc->sync_count++;
 		}
+		lc->sync_pass = 0;
+
+		DMDEBUG("Resync work completed: %Lu", lr->u.lr_region);
 	} else if (log_test_bit(lc->sync_bits, lr->u.lr_region)) {
 		/* gone again: lc->sync_count--;*/
 		log_clear_bit(lc, lc->sync_bits, lr->u.lr_region);



* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-server.c
@ 2007-04-04 21:35 jbrassow
  0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2007-04-04 21:35 UTC (permalink / raw)
  To: cluster-devel.redhat.com

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	RHEL4
Changes by:	jbrassow at sourceware.org	2007-04-04 22:35:24

Modified files:
	cmirror-kernel/src: dm-cmirror-server.c 

Log message:
	Bug 235252: cmirror synchronization deadlocked waiting for response fro...
	
	Moved the check for recovery/write conflict to flush from mark_region
	to avoid potential conflicts that were causing writes to indefinitely
	hang on failure conditions.

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-server.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.1.2.28&r2=1.1.2.29

--- cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2007/04/03 18:21:10	1.1.2.28
+++ cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2007/04/04 21:35:23	1.1.2.29
@@ -223,10 +223,13 @@
 	return count;
 }
 
+struct region_user *find_ru_by_region(struct log_c *lc, region_t region);
 static int _core_get_resync_work(struct log_c *lc, region_t *region)
 {
+	int sync_search, conflict = 0;
+
 	if (lc->recovering_region != (uint64_t)-1) {
-		DMDEBUG("Someone is already recovering (%Lu)", lc->recovering_region);
+		DMDEBUG("Someone is already recovering region %Lu", lc->recovering_region);
 		return 0;
 	}
 
@@ -242,16 +245,27 @@
 			return 0;
 		}
 	}
-	*region = ext2_find_next_zero_bit((unsigned long *) lc->sync_bits,
-					  lc->region_count,
-					  lc->sync_search);
-	lc->sync_search = *region + 1;
+	for (sync_search = lc->sync_search;
+	     sync_search < lc->region_count;
+	     sync_search = (*region + 1)) {
+		*region = ext2_find_next_zero_bit((unsigned long *) lc->sync_bits,
+						  lc->region_count,
+						  sync_search);
+		if (find_ru_by_region(lc, *region)) {
+			conflict = 1;
+			DMDEBUG("Recovery blocked by outstanding write on region %Lu",
+			      *region);
+		} else {
+			break;
+		}
+	}
+	if (!conflict)
+		lc->sync_search = *region + 1;
 
 	if (*region >= lc->region_count)
 		return 0;
 
 	lc->recovering_region = *region;
-	DMDEBUG("Assigning recovery work: %Lu", *region);
 	return 1;
 }
 
@@ -374,6 +388,8 @@
 			bad_count++;
 			log_clear_bit(lc, lc->sync_bits, ru->ru_region);
 			if (ru->ru_rw == RU_RECOVER) {
+				DMINFO("Failed node was recovering region %Lu - cleared",
+				       ru->ru_region);
 				lc->recovering_region = (uint64_t)-1;
 			}
 			list_del(&ru->ru_list);
@@ -523,14 +539,19 @@
 		log_clear_bit(lc, lc->clean_bits, lr->u.lr_region);
 		list_add(&new->ru_list, &lc->region_users);
 	} else if (ru->ru_rw == RU_RECOVER) {
+		/*
+		 * The flush will block if a write conflicts with a
+		 * recovering region.  In the meantime, we add this
+		 * entry to the tail of the list so the recovery
+		 * gets cleared first.
+		 */
 		DMDEBUG("Attempt to mark a region " SECTOR_FORMAT
 		      "/%s which is being recovered.",
 		       lr->u.lr_region, lc->uuid + (strlen(lc->uuid) - 8));
 		DMDEBUG("Current recoverer: %u", ru->ru_nodeid);
 		DMDEBUG("Mark requester   : %u", who);
-
-		mempool_free(new, region_user_pool);
-		return -EBUSY;
+		log_clear_bit(lc, lc->clean_bits, lr->u.lr_region);
+		list_add_tail(&new->ru_list, &lc->region_users);
 	} else if (!find_ru(lc, who, lr->u.lr_region)) {
 		list_add(&new->ru_list, &ru->ru_list);
 	} else {
@@ -569,6 +590,34 @@
 static int server_flush(struct log_c *lc)
 {
 	int r = 0;
+	int count = 0;
+	struct region_user *ru, *ru2;
+
+	if (lc->recovering_region != (uint64_t)-1) {
+		list_for_each_entry(ru, &lc->region_users, ru_list)
+			if (ru->ru_region == lc->recovering_region)
+				count++;
+
+		if (count > 1) {
+			list_for_each_entry(ru, &lc->region_users, ru_list)
+				if (ru->ru_rw == RU_RECOVER)
+					break;
+
+			DMDEBUG("Flush includes region which is being recovered (%u/%Lu).  Delaying...",
+				ru->ru_nodeid, ru->ru_region);
+			DMDEBUG("Recovering region: %Lu", lc->recovering_region);
+			DMDEBUG("  sync_bit: %s, clean_bit: %s",
+				log_test_bit(lc->sync_bits, lc->recovering_region) ? "set" : "unset",
+				log_test_bit(lc->clean_bits, lc->recovering_region) ? "set" : "unset");
+
+			list_for_each_entry(ru2, &lc->region_users, ru_list)
+				if (ru->ru_region == ru2->ru_region)
+					DMDEBUG("  %s", (ru2->ru_rw == RU_RECOVER) ? "recover" :
+						(ru2->ru_rw == RU_WRITE) ? "writer" : "unknown");
+
+			return -EBUSY;
+		}
+	}
 
 	r = write_bits(lc);
 	if (!r) {
@@ -597,6 +646,7 @@
 		new->ru_region = lr->u.lr_region_rtn;
 		new->ru_rw = RU_RECOVER;
 		list_add(&new->ru_list, &lc->region_users);
+		DMDEBUG("Assigning recovery work to %u: %Lu", who, new->ru_region);
 	} else {
 		mempool_free(new, region_user_pool);
 	}
@@ -624,6 +674,9 @@
 			log_set_bit(lc, lc->sync_bits, lr->u.lr_region);
 			lc->sync_count++;
 		}
+		lc->sync_pass = 0;
+
+		DMDEBUG("Resync work completed: %Lu", lr->u.lr_region);
 	} else if (log_test_bit(lc->sync_bits, lr->u.lr_region)) {
 		/* gone again: lc->sync_count--;*/
 		log_clear_bit(lc, lc->sync_bits, lr->u.lr_region);



* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-server.c
@ 2006-07-22 22:51 jbrassow
  0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2006-07-22 22:51 UTC (permalink / raw)
  To: cluster-devel.redhat.com

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	STABLE
Changes by:	jbrassow at sourceware.org	2006-07-22 22:51:55

Modified files:
	cmirror-kernel/src: dm-cmirror-server.c 

Log message:
	while pulling debugging prints from the last patch, I pulled an
	important line by accident.

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-server.c.diff?cvsroot=cluster&only_with_tag=STABLE&r1=1.1.4.5&r2=1.1.4.6

--- cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2006/07/22 22:19:04	1.1.4.5
+++ cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2006/07/22 22:51:55	1.1.4.6
@@ -934,8 +934,8 @@
 		}
 
 		/* ATTENTION -- if error? */
-/*
 		if(error){
+/*
 			DMWARN("Error (%d) while processing request (%s)",
 			       error,
 			       (lr.lr_type == LRT_IS_CLEAN)? "LRT_IS_CLEAN":
@@ -948,9 +948,9 @@
 			       (lr.lr_type == LRT_MASTER_LEAVING)? "LRT_MASTER_LEAVING":
 			       (lr.lr_type == LRT_ELECTION)? "LRT_ELECTION":
 			       (lr.lr_type == LRT_SELECTION)? "LRT_SELECTION": "UNKNOWN");
+*/
 			lr.u.lr_int_rtn = error;
 		}
-*/
 	reply:
     
 		/* Why do we need to reset this? */



* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-server.c
@ 2006-07-22 22:50 jbrassow
  0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2006-07-22 22:50 UTC (permalink / raw)
  To: cluster-devel.redhat.com

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	RHEL4
Changes by:	jbrassow at sourceware.org	2006-07-22 22:50:38

Modified files:
	cmirror-kernel/src: dm-cmirror-server.c 

Log message:
	while pulling debugging prints from the last patch, I pulled an
	important line by accident.

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-server.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.1.2.14&r2=1.1.2.15

--- cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2006/07/22 22:19:34	1.1.2.14
+++ cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2006/07/22 22:50:38	1.1.2.15
@@ -908,8 +908,8 @@
 		}
 
 		/* ATTENTION -- if error? */
-/*
 		if(error){
+/*
 			DMWARN("Error (%d) while processing request (%s)",
 			       error,
 			       (lr.lr_type == LRT_IS_CLEAN)? "LRT_IS_CLEAN":
@@ -922,9 +922,9 @@
 			       (lr.lr_type == LRT_MASTER_LEAVING)? "LRT_MASTER_LEAVING":
 			       (lr.lr_type == LRT_ELECTION)? "LRT_ELECTION":
 			       (lr.lr_type == LRT_SELECTION)? "LRT_SELECTION": "UNKNOWN");
+*/
 			lr.u.lr_int_rtn = error;
 		}
-*/
 	reply:
     
 		/* Why do we need to reset this? */



* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-server.c
@ 2006-07-22 22:49 jbrassow
  0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2006-07-22 22:49 UTC (permalink / raw)
  To: cluster-devel.redhat.com

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	RHEL4U4
Changes by:	jbrassow at sourceware.org	2006-07-22 22:49:49

Modified files:
	cmirror-kernel/src: dm-cmirror-server.c 

Log message:
	while pulling debugging prints from the last patch, I pulled an
	important line by accident.

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-server.c.diff?cvsroot=cluster&only_with_tag=RHEL4U4&r1=1.1.2.9.2.5&r2=1.1.2.9.2.6

--- cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2006/07/22 22:12:32	1.1.2.9.2.5
+++ cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2006/07/22 22:49:49	1.1.2.9.2.6
@@ -908,8 +908,8 @@
 		}
 
 		/* ATTENTION -- if error? */
-/*
 		if(error){
+/*
 			DMWARN("Error (%d) while processing request (%s)",
 			       error,
 			       (lr.lr_type == LRT_IS_CLEAN)? "LRT_IS_CLEAN":
@@ -922,9 +922,9 @@
 			       (lr.lr_type == LRT_MASTER_LEAVING)? "LRT_MASTER_LEAVING":
 			       (lr.lr_type == LRT_ELECTION)? "LRT_ELECTION":
 			       (lr.lr_type == LRT_SELECTION)? "LRT_SELECTION": "UNKNOWN");
+*/
 			lr.u.lr_int_rtn = error;
 		}
-*/
 	reply:
     
 		/* Why do we need to reset this? */



* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-server.c
@ 2006-07-19 14:39 jbrassow
  0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2006-07-19 14:39 UTC (permalink / raw)
  To: cluster-devel.redhat.com

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	RHEL4
Changes by:	jbrassow at sourceware.org	2006-07-19 14:39:13

Modified files:
	cmirror-kernel/src: dm-cmirror-server.c 

Log message:
	- Fix for:
	198563 - clvmd panic in dm_mod:resize_pool while ...
	198659 - slab error in kmem_cache_destroy() on m ...
	
	The log server was not informing the device-mapper core of its
	intentions to use its I/O interfaces.  This caused device-mapper
	to prematurely release the resources.

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-server.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.1.2.12&r2=1.1.2.13

--- cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2006/06/29 19:48:01	1.1.2.12
+++ cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2006/07/19 14:39:12	1.1.2.13
@@ -1058,6 +1058,7 @@
 		DMWARN("Cluster mirror log server thread failed to start");
 		return -1;
 	}
+	dm_io_get(32);
 	return 0;
 }
 
@@ -1066,6 +1067,7 @@
 	atomic_set(&server_run, 0);
 
 	wait_for_completion(&server_completion);
+	dm_io_put(32);
 }
 /*
  * Overrides for Emacs so that we follow Linus's tabbing style.



* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-server.c
@ 2006-07-19 14:38 jbrassow
  0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2006-07-19 14:38 UTC (permalink / raw)
  To: cluster-devel.redhat.com

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	RHEL4U4
Changes by:	jbrassow at sourceware.org	2006-07-19 14:38:20

Modified files:
	cmirror-kernel/src: dm-cmirror-server.c 

Log message:
	- Fix for:
	198563 - clvmd panic in dm_mod:resize_pool while ...
	198659 - slab error in kmem_cache_destroy() on m ...
	
	The log server was not informing the device-mapper core of its
	intentions to use its I/O interfaces.  This caused device-mapper
	to prematurely release the resources.

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-server.c.diff?cvsroot=cluster&only_with_tag=RHEL4U4&r1=1.1.2.9.2.3&r2=1.1.2.9.2.4

--- cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2006/06/29 19:46:37	1.1.2.9.2.3
+++ cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2006/07/19 14:38:20	1.1.2.9.2.4
@@ -1058,6 +1058,7 @@
 		DMWARN("Cluster mirror log server thread failed to start");
 		return -1;
 	}
+	dm_io_get(32);
 	return 0;
 }
 
@@ -1066,6 +1067,7 @@
 	atomic_set(&server_run, 0);
 
 	wait_for_completion(&server_completion);
+	dm_io_put(32);
 }
 /*
  * Overrides for Emacs so that we follow Linus's tabbing style.



* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-server.c
@ 2006-06-27 20:26 jbrassow
  0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2006-06-27 20:26 UTC (permalink / raw)
  To: cluster-devel.redhat.com

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	RHEL4U4
Changes by:	jbrassow at sourceware.org	2006-06-27 20:26:02

Modified files:
	cmirror-kernel/src: dm-cmirror-server.c 

Log message:
	- I was incorrectly decrementing the sync count when a mirror region
	went out of sync.  We need to properly log the reset in the bits,
	but still allow nodes to detect the failure of the device and be
	able to switch the primary device.

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-server.c.diff?cvsroot=cluster&only_with_tag=RHEL4U4&r1=1.1.2.9.2.1&r2=1.1.2.9.2.2

--- cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2006/06/21 21:20:59	1.1.2.9.2.1
+++ cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2006/06/27 20:26:02	1.1.2.9.2.2
@@ -586,7 +586,7 @@
 		}
 	} else if (log_test_bit(lc->sync_bits, lr->u.lr_region)) {
 		DMERR("complete_resync_work region going out-of-sync: disk failure");
-		lc->sync_count--;
+		/* gone for now: lc->sync_count--; */
 		log_clear_bit(lc, lc->sync_bits, lr->u.lr_region);
 	}
 



* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-server.c
@ 2006-06-27 20:25 jbrassow
  0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2006-06-27 20:25 UTC (permalink / raw)
  To: cluster-devel.redhat.com

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	RHEL4
Changes by:	jbrassow at sourceware.org	2006-06-27 20:24:59

Modified files:
	cmirror-kernel/src: dm-cmirror-server.c 

Log message:
	- I was incorrectly decrementing the sync count when a mirror region
	went out of sync.  We need to properly log the reset in the bits,
	but still allow nodes to detect the failure of the device and be
	able to switch the primary device.

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-server.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.1.2.10&r2=1.1.2.11

--- cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2006/06/21 21:09:49	1.1.2.10
+++ cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2006/06/27 20:24:59	1.1.2.11
@@ -586,7 +586,7 @@
 		}
 	} else if (log_test_bit(lc->sync_bits, lr->u.lr_region)) {
 		DMERR("complete_resync_work region going out-of-sync: disk failure");
-		lc->sync_count--;
+		/* gone for now: lc->sync_count--; */
 		log_clear_bit(lc, lc->sync_bits, lr->u.lr_region);
 	}
 



* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-server.c
@ 2006-06-21 21:21 jbrassow
  0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2006-06-21 21:21 UTC (permalink / raw)
  To: cluster-devel.redhat.com

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	RHEL4U4
Changes by:	jbrassow at sourceware.org	2006-06-21 21:20:59

Modified files:
	cmirror-kernel/src: dm-cmirror-server.c 

Log message:
	- fix for bug 195610 (renaming a clustered mirror is broken)
	
	The problem is that LVM2/device-mapper calls the mirror constructor
	before calling the destructor.  As a result, two copies of the
	log context exist in the cluster mirror.
	
	I've added code to detect and handle this, but it should probably
	also be fixed in LVM2/device-mapper.
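
The detect-and-handle approach can be sketched as a lookup that keeps the first match and reports any further ones instead of returning early. This is a userspace illustration only: the singly linked list, the MAX_NAME_LEN value, and the names log_ctx and find_log_ctx are assumptions and do not come from the module.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_NAME_LEN 128	/* assumed value, for illustration only */

/* Hypothetical, simplified log context. */
struct log_ctx {
	char uuid[MAX_NAME_LEN];
	struct log_ctx *next;
};

/*
 * Return the first context whose uuid matches, or NULL if none does.
 * The number of additional matches is stored in *dups and each one
 * is reported on stderr.
 */
struct log_ctx *find_log_ctx(struct log_ctx *head, const char *uuid,
			     int *dups)
{
	struct log_ctx *lc, *found = NULL;

	assert(uuid != NULL && dups != NULL);
	*dups = 0;

	for (lc = head; lc; lc = lc->next) {
		if (strncmp(lc->uuid, uuid, MAX_NAME_LEN) != 0)
			continue;
		if (found) {
			(*dups)++;
			fprintf(stderr, "duplicate log context for %s\n", uuid);
		} else {
			found = lc;
		}
	}

	return found;
}
```

Scanning the whole list costs a little more than returning on the first hit, but it makes the duplicate visible instead of silently picking one copy.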

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-server.c.diff?cvsroot=cluster&only_with_tag=RHEL4U4&r1=1.1.2.9&r2=1.1.2.9.2.1

--- cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2006/06/14 22:14:55	1.1.2.9
+++ cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2006/06/21 21:20:59	1.1.2.9.2.1
@@ -298,7 +298,6 @@
 	DMINFO("Disk Resume::");
 
 	debug_disk_write = 1;
-
 	memset(live_nodes, 0, sizeof(live_nodes));
 	for(i = 0; i < global_count; i++){
 		live_nodes[global_nodeids[i]/8] |= 1 << (global_nodeids[i]%8);
@@ -612,15 +611,19 @@
 
 
 static struct log_c *get_log_context(char *uuid){
-	struct log_c *lc;
+	struct log_c *lc, *r = NULL;
 
 	list_for_each_entry(lc, &log_list_head, log_list){
 		if(!strncmp(lc->uuid, uuid, MAX_NAME_LEN)){
-			return lc;
+			if (r)
+				DMERR("HEY!!! There are two matches for %s",
+				      lc->uuid + (strlen(lc->uuid) - 8));
+			else
+				r = lc;
 		}
 	}
 
-	return NULL;
+	return r;
 }
 
 



* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-server.c
@ 2006-06-21 21:09 jbrassow
  0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2006-06-21 21:09 UTC (permalink / raw)
  To: cluster-devel.redhat.com

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	RHEL4
Changes by:	jbrassow at sourceware.org	2006-06-21 21:09:49

Modified files:
	cmirror-kernel/src: dm-cmirror-server.c 

Log message:
	- fix for bug 195610 (renaming a clustered mirror is broken)
	
	The problem is that LVM2/device-mapper calls the mirror constructor
	before calling the destructor.  As a result, two copies of the
	log context exist in the cluster mirror.
	
	I've added code to detect and handle this, but it should probably
	also be fixed in LVM2/device-mapper.

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-server.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.1.2.9&r2=1.1.2.10

--- cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2006/06/14 22:14:55	1.1.2.9
+++ cluster/cmirror-kernel/src/Attic/dm-cmirror-server.c	2006/06/21 21:09:49	1.1.2.10
@@ -298,7 +298,6 @@
 	DMINFO("Disk Resume::");
 
 	debug_disk_write = 1;
-
 	memset(live_nodes, 0, sizeof(live_nodes));
 	for(i = 0; i < global_count; i++){
 		live_nodes[global_nodeids[i]/8] |= 1 << (global_nodeids[i]%8);
@@ -612,15 +611,19 @@
 
 
 static struct log_c *get_log_context(char *uuid){
-	struct log_c *lc;
+	struct log_c *lc, *r = NULL;
 
 	list_for_each_entry(lc, &log_list_head, log_list){
 		if(!strncmp(lc->uuid, uuid, MAX_NAME_LEN)){
-			return lc;
+			if (r)
+				DMERR("HEY!!! There are two matches for %s",
+				      lc->uuid + (strlen(lc->uuid) - 8));
+			else
+				r = lc;
 		}
 	}
 
-	return NULL;
+	return r;
 }
 
 



* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-server.c
@ 2006-06-14 22:14 jbrassow
  0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2006-06-14 22:14 UTC (permalink / raw)
  To: cluster-devel.redhat.com

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	RHEL4
Changes by:	jbrassow at sourceware.org	2006-06-14 22:14:55

Modified files:
	cmirror-kernel/src: dm-cmirror-server.c 

Log message:
	- error messages that really should be debug messages.

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-server.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.1.2.8&r2=1.1.2.9




Thread overview: 17+ messages
2006-07-19 14:40 [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-server.c jbrassow
  -- strict thread matches above, loose matches on Subject: below --
2007-10-26 18:46 jbrassow
2007-04-17 19:49 jbrassow
2007-04-10 18:10 jbrassow
2007-04-10 18:09 jbrassow
2007-04-04 21:36 jbrassow
2007-04-04 21:35 jbrassow
2006-07-22 22:51 jbrassow
2006-07-22 22:50 jbrassow
2006-07-22 22:49 jbrassow
2006-07-19 14:39 jbrassow
2006-07-19 14:38 jbrassow
2006-06-27 20:26 jbrassow
2006-06-27 20:25 jbrassow
2006-06-21 21:21 jbrassow
2006-06-21 21:09 jbrassow
2006-06-14 22:14 jbrassow
