* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-client.c
@ 2006-11-27 23:15 jbrassow
0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2006-11-27 23:15 UTC (permalink / raw)
To: cluster-devel.redhat.com
CVSROOT: /cvs/cluster
Module name: cluster
Branch: RHEL4
Changes by: jbrassow at sourceware.org 2006-11-27 23:15:45
Modified files:
cmirror-kernel/src: dm-cmirror-client.c
Log message:
Bug 217438: scrolling kernel requests to mark mirror regions
It is sometimes possible for a mark request to be issued on a region
that is being recovered. The mark must be delayed until the recovery
is complete. The delay was already happening, but a rather severe
error was being reported and the (mark region) retries were happening
too fast.
Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-client.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.1.2.28&r2=1.1.2.29
--- cluster/cmirror-kernel/src/Attic/dm-cmirror-client.c 2006/11/27 22:36:48 1.1.2.28
+++ cluster/cmirror-kernel/src/Attic/dm-cmirror-client.c 2006/11/27 23:15:43 1.1.2.29
@@ -941,9 +941,19 @@
spin_unlock(&region_state_lock);
while((error = consult_server(lc, region, LRT_MARK_REGION, NULL))){
+ if (error == -EBUSY) {
+ /* Remote recovering delay and try again */
+ DMDEBUG("Delaying mark to region %Lu, due to recovery",
+ region);
+ set_current_state(TASK_INTERRUPTIBLE);
+ schedule_timeout(HZ/2);
+ continue;
+ }
+
DMWARN("unable to get server (%u) to mark region (%Lu)",
lc->server_id, region);
DMWARN("Reason :: %d", error);
+
if (error == -EIO) {
lc->log_dev_failed = 1;
break;
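The delay-and-retry behaviour described in the log message above can be sketched in user space as follows. This is only an illustration under stated assumptions, not the kernel code: `mark_with_retry`, `request_fn`, and `busy_then_ok` are hypothetical names, `nanosleep` stands in for `schedule_timeout(HZ/2)`, and the sketch returns on any error other than -EBUSY, which simplifies what the patched loop does.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for consult_server(): returns 0 or a negative errno. */
typedef int (*request_fn)(void *ctx);

/*
 * Retry a request while the server answers -EBUSY, sleeping delay_ms
 * between attempts. Any other result is returned to the caller at once.
 * If busy_retries is non-NULL it receives the number of -EBUSY replies.
 */
static int mark_with_retry(request_fn request, void *ctx,
                           long delay_ms, int *busy_retries)
{
    struct timespec delay = { delay_ms / 1000, (delay_ms % 1000) * 1000000L };
    int error;
    int retries = 0;

    assert(request != NULL);
    while ((error = request(ctx)) == -EBUSY) {
        retries++;
        nanosleep(&delay, NULL);   /* back off instead of spinning */
    }
    if (busy_retries)
        *busy_retries = retries;
    return error;
}

/* Example request: reports -EBUSY until the countdown in *ctx reaches zero. */
static int busy_then_ok(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return -EBUSY;
    }
    return 0;
}
```

Calling `mark_with_retry(busy_then_ok, &remaining, 500, &retries)` with `remaining` set to 3 would sleep three times before succeeding; 500 ms mirrors the HZ/2 delay chosen in the patch.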
* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-client.c
@ 2007-08-23 16:51 jbrassow
0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2007-08-23 16:51 UTC (permalink / raw)
To: cluster-devel.redhat.com
CVSROOT: /cvs/cluster
Module name: cluster
Branch: RHEL4
Changes by: jbrassow at sourceware.org 2007-08-23 16:51:39
Modified files:
cmirror-kernel/src: dm-cmirror-client.c
Log message:
BUG 239856 Processed: failed server election due to suspended mirror...
When two logs are loaded that represent the same entity, the second
log should start out suspended.
Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-client.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.1.2.49&r2=1.1.2.50
--- cluster/cmirror-kernel/src/Attic/dm-cmirror-client.c 2007/07/11 16:18:03 1.1.2.49
+++ cluster/cmirror-kernel/src/Attic/dm-cmirror-client.c 2007/08/23 16:51:39 1.1.2.50
@@ -586,6 +586,11 @@
if(!strncmp(tmp_lc->uuid, lc->uuid, MAX_NAME_LEN)){
lc->uuid_ref = (lc->uuid_ref > tmp_lc->uuid_ref) ?
lc->uuid_ref : tmp_lc->uuid_ref + 1;
+ /*
+ * A second instance of the same log
+ * should start out suspended.
+ */
+ atomic_set(&lc->suspended, 1);
}
}
@@ -601,7 +606,6 @@
INIT_LIST_HEAD(&lc->mark_logged);
spin_lock_init(&lc->state_lock);
- atomic_set(&lc->suspended, 1);
lc->server_valid = 0;
lc->server_id = 0xDEAD;
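The rule in the log message above (a second log with the same UUID starts out suspended) can be sketched in isolation. This is a reduced user-space illustration, assuming a hypothetical `struct log_ctx` and `register_log` helper and an arbitrary `MAX_NAME_LEN`; it also assumes a log with no duplicate starts unsuspended, which the diff suggests but does not show directly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NAME_LEN 128   /* assumed value for this sketch */

struct log_ctx {           /* hypothetical, reduced stand-in for struct log_c */
    char uuid[MAX_NAME_LEN];
    int uuid_ref;
    int suspended;
};

/*
 * Initialise 'lc' against the logs already loaded. If any existing log
 * carries the same uuid, 'lc' receives a higher reference number and
 * starts out suspended; otherwise it starts active.
 */
static void register_log(struct log_ctx *lc,
                         const struct log_ctx *existing, size_t n)
{
    size_t i;

    assert(lc != NULL);
    lc->uuid_ref = 0;
    lc->suspended = 0;
    for (i = 0; i < n; i++) {
        if (!strncmp(existing[i].uuid, lc->uuid, MAX_NAME_LEN)) {
            if (lc->uuid_ref <= existing[i].uuid_ref)
                lc->uuid_ref = existing[i].uuid_ref + 1;
            lc->suspended = 1;   /* second instance of the same log */
        }
    }
}
```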
* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-client.c
@ 2007-05-09 21:44 jbrassow
0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2007-05-09 21:44 UTC (permalink / raw)
To: cluster-devel.redhat.com
CVSROOT: /cvs/cluster
Module name: cluster
Branch: RHEL4
Changes by: jbrassow at sourceware.org 2007-05-09 21:44:34
Modified files:
cmirror-kernel/src: dm-cmirror-client.c
Log message:
Bug 235040 is fixed in the 4.6 kernel, no need to work around it now.
Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-client.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.1.2.47&r2=1.1.2.48
--- cluster/cmirror-kernel/src/Attic/dm-cmirror-client.c 2007/04/26 16:54:49 1.1.2.47
+++ cluster/cmirror-kernel/src/Attic/dm-cmirror-client.c 2007/05/09 21:44:34 1.1.2.48
@@ -756,13 +756,8 @@
DMERR("Unable to disconnect from cluster infrastructure.\n");
}
-static int cluster_flush(struct dirty_log *log);
static int cluster_presuspend(struct dirty_log *log)
{
- /* FIXME: flush is work-around for bug 235040 */
- DMDEBUG("Performing flush to work around bug 235040");
- cluster_flush(log);
- DMDEBUG("Log flush complete");
return 0;
}
* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-client.c
@ 2007-03-02 22:31 jbrassow
0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2007-03-02 22:31 UTC (permalink / raw)
To: cluster-devel.redhat.com
CVSROOT: /cvs/cluster
Module name: cluster
Branch: RHEL4
Changes by: jbrassow at sourceware.org 2007-03-02 22:31:14
Modified files:
cmirror-kernel/src: dm-cmirror-client.c
Log message:
- When a cluster log resumes, the server must re-read the log device.
However, it doesn't _need_ to happen until a request comes in which
forces an election.
If a resume/suspend loop is tight enough, it is possible for a
log read to happen after the log device has been suspended.
The fix is to force the log read during the resume call.
Bug 227398: cmirror request to LRT_CLEAR_REGION fails...
Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-client.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.1.2.39&r2=1.1.2.40
--- cluster/cmirror-kernel/src/Attic/dm-cmirror-client.c 2007/02/26 17:38:06 1.1.2.39
+++ cluster/cmirror-kernel/src/Attic/dm-cmirror-client.c 2007/03/02 22:31:14 1.1.2.40
@@ -305,7 +305,8 @@
lc->server_id = lr.u.lr_coordinator;
} else {
/* ATTENTION -- what do we do with this ? */
- DMWARN("Failed to receive election results from server: %d", len);
+ DMWARN("Failed to receive election results from server: (%s,%d)",
+ lc->uuid + (strlen(lc->uuid) - 8), len);
error = len;
}
@@ -464,6 +465,8 @@
(type == LRT_ELECTION)? "LRT_ELECTION":
(type == LRT_SELECTION)? "LRT_SELECTION": "UNKNOWN");
DMDEBUG(" - error :: %d", error);
+ DMINFO("Too many retries, attempting to re-establish server connection.");
+ lc->server_id = 0xDEAD;
}
}
@@ -764,11 +767,6 @@
list_del_init(&lc->log_list);
spin_unlock(&log_list_lock);
- if ((lc->server_id == my_id) && !atomic_read(&lc->suspended))
- consult_server(lc, 0, LRT_MASTER_LEAVING, NULL);
-
- sock_release(lc->client_sock);
-
spin_lock(&region_state_lock);
list_for_each_entry_safe(rs, tmp_rs, &clear_region_list, rs_list) {
@@ -783,6 +781,10 @@
spin_unlock(&region_state_lock);
+ if ((lc->server_id == my_id) && !atomic_read(&lc->suspended))
+ consult_server(lc, 0, LRT_MASTER_LEAVING, NULL);
+
+ sock_release(lc->client_sock);
if (lc->log_dev)
disk_dtr(log);
@@ -842,6 +844,7 @@
lc->sync_search = 0;
resume_server_requests();
atomic_set(&lc->suspended, 0);
+ consult_server(lc, 0, LRT_IN_SYNC, NULL);
return 0;
}
* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-client.c
@ 2006-12-05 17:49 jbrassow
0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2006-12-05 17:49 UTC (permalink / raw)
To: cluster-devel.redhat.com
CVSROOT: /cvs/cluster
Module name: cluster
Branch: RHEL4
Changes by: jbrassow at sourceware.org 2006-12-05 17:49:08
Modified files:
cmirror-kernel/src: dm-cmirror-client.c
Log message:
Just a comment update, so I don't forget why I'm checking for
duplicate mark region requests.
< /* ATTENTION -- this check should not be necessary. **
< ** Why are regions being marked again before a clear? */
---
> /*
> * In the mirroring code, it is possible for a write
> * to complete and call rh_dec - putting the region on
> * the clear_region list. However, before the actual
> * clear request is issued to the log (rh_update_states)
> * another mark happens. So, we check for and remove
> * duplicates.
> */
Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-client.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.1.2.31&r2=1.1.2.32
--- cluster/cmirror-kernel/src/Attic/dm-cmirror-client.c 2006/12/05 15:08:17 1.1.2.31
+++ cluster/cmirror-kernel/src/Attic/dm-cmirror-client.c 2006/12/05 17:49:08 1.1.2.32
@@ -918,8 +918,14 @@
return;
}
}
- /* ATTENTION -- this check should not be necessary. **
- ** Why are regions being marked again before a clear? */
+ /*
+ * In the mirroring code, it is possible for a write
+ * to complete and call rh_dec - putting the region on
+ * the clear_region list. However, before the actual
+ * clear request is issued to the log (rh_update_states)
+ * another mark happens. So, we check for and remove
+ * duplicates.
+ */
list_for_each_entry(rs, &marked_region_list, rs_list){
if(lc == rs->rs_lc && region == rs->rs_region){
#ifdef DEBUG
* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-client.c
@ 2006-12-05 15:08 jbrassow
0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2006-12-05 15:08 UTC (permalink / raw)
To: cluster-devel.redhat.com
CVSROOT: /cvs/cluster
Module name: cluster
Branch: RHEL4
Changes by: jbrassow at sourceware.org 2006-12-05 15:08:18
Modified files:
cmirror-kernel/src: dm-cmirror-client.c
Log message:
Bug 214487: "Attempt to mark a already marked region" messages when ...
Similar to the last checkin. When a log server moved, all entries in
the clear region list were removed rather than just those that were
associated with the log that moved. Again, this would cause the server
to report that the regions were already marked when the next mark
request came in.
I believe that this action was harmless (but annoying).
Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-client.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.1.2.30&r2=1.1.2.31
--- cluster/cmirror-kernel/src/Attic/dm-cmirror-client.c 2006/12/05 03:13:50 1.1.2.30
+++ cluster/cmirror-kernel/src/Attic/dm-cmirror-client.c 2006/12/05 15:08:17 1.1.2.31
@@ -503,11 +503,14 @@
DMINFO(" - Wiping clear region list");
list_for_each_entry_safe(rs, tmp_rs,
&clear_region_list, rs_list){
+ /* Remove only those associated with referenced log */
+ if (rs->rs_lc != lc)
+ continue;
i++;
list_del_init(&rs->rs_list);
mempool_free(rs, region_state_pool);
}
- clear_region_count=0;
+ clear_region_count -= i;
DMINFO(" - %d clear region requests wiped", i);
DMINFO(" - Resending all mark region requests");
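The fix above (remove only the entries that belong to the log that moved, and lower the shared counter by the number actually removed) can be sketched with a plain singly linked list. This is a user-space illustration under assumptions: `struct region_entry`, `wipe_entries_for`, and `push_entry` are hypothetical names, and `malloc`/`free` stand in for the kernel mempool.

```c
#include <assert.h>
#include <stdlib.h>

struct region_entry {          /* hypothetical stand-in for struct region_state */
    const void *owner;         /* the log this request belongs to */
    unsigned long region;
    struct region_entry *next;
};

/*
 * Remove and free every entry owned by 'owner' from the list at *head.
 * Returns the number removed so the caller can adjust a shared counter,
 * in the spirit of "clear_region_count -= i" in the patch.
 */
static int wipe_entries_for(struct region_entry **head, const void *owner)
{
    int removed = 0;

    assert(head != NULL);
    while (*head) {
        struct region_entry *cur = *head;

        if (cur->owner == owner) {
            *head = cur->next;   /* unlink, then free */
            free(cur);
            removed++;
        } else {
            head = &cur->next;   /* keep entries of other logs */
        }
    }
    return removed;
}

/* Push a new entry on the front of the list; returns 0 on success. */
static int push_entry(struct region_entry **head, const void *owner,
                      unsigned long region)
{
    struct region_entry *e = malloc(sizeof(*e));

    if (!e)
        return -1;
    e->owner = owner;
    e->region = region;
    e->next = *head;
    *head = e;
    return 0;
}
```

Resetting the counter to zero, as the code did before this patch, would only be correct when every entry on the list belongs to the same log.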
* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-client.c
@ 2006-12-05 3:13 jbrassow
0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2006-12-05 3:13 UTC (permalink / raw)
To: cluster-devel.redhat.com
CVSROOT: /cvs/cluster
Module name: cluster
Branch: RHEL4
Changes by: jbrassow at sourceware.org 2006-12-05 03:13:51
Modified files:
cmirror-kernel/src: dm-cmirror-client.c
Log message:
Bug 214487: "Attempt to mark a already marked region" messages when ...
When a log server moves (like during a failure or the mirror being
shut down on that node), the clients send the new server the regions
that they have marked.
However, the clients were sending _all_ of the regions they had
marked for every mirror that was active - even for the ones whose
log server had not moved. This was causing those servers to report
that they already had those regions marked.
Other than the annoying messages, I don't believe there were any
side effects of this bug.
Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-client.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.1.2.29&r2=1.1.2.30
--- cluster/cmirror-kernel/src/Attic/dm-cmirror-client.c 2006/11/27 23:15:43 1.1.2.29
+++ cluster/cmirror-kernel/src/Attic/dm-cmirror-client.c 2006/12/05 03:13:50 1.1.2.30
@@ -512,6 +512,9 @@
DMINFO(" - Resending all mark region requests");
list_for_each_entry(rs, &marked_region_list, rs_list){
+ /* Resend only those associated with referenced log */
+ if (rs->rs_lc != lc)
+ continue;
do {
retry = 0;
DMINFO(" - " SECTOR_FORMAT, rs->rs_region);
* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-client.c
@ 2006-11-27 22:36 jbrassow
0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2006-11-27 22:36 UTC (permalink / raw)
To: cluster-devel.redhat.com
CVSROOT: /cvs/cluster
Module name: cluster
Branch: RHEL4
Changes by: jbrassow at sourceware.org 2006-11-27 22:36:48
Modified files:
cmirror-kernel/src: dm-cmirror-client.c
Log message:
Bug 214517: hung cmirror operations due to looping mirror region requests
Once we've completed handling a new server, we must reset the 'new_server'
variable, so that if we need to retry a request we don't go through the
whole process again.
Also, added more error reporting for when a request must be retried.
Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-client.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.1.2.27&r2=1.1.2.28
--- cluster/cmirror-kernel/src/Attic/dm-cmirror-client.c 2006/11/07 20:48:16 1.1.2.27
+++ cluster/cmirror-kernel/src/Attic/dm-cmirror-client.c 2006/11/27 22:36:48 1.1.2.28
@@ -406,6 +406,7 @@
if(len <= 0){
/* ATTENTION -- what do we do with this ? */
+ DMWARN("Error while listening for server response: %d", len);
error = len;
*retry = 1;
goto fail;
@@ -420,6 +421,8 @@
}
if (lr->u.lr_int_rtn == -ENXIO) {
+ DMDEBUG("Server (%u) says it no longer controls this log (%s)",
+ lc->server_id, lc->uuid + (strlen(lc->uuid) - 8));
lc->server_id = 0xDEAD;
*retry = 1;
goto fail;
@@ -540,6 +543,7 @@
(type == LRT_SELECTION)? "LRT_SELECTION": "UNKNOWN"
);
}
+ new_server = 0;
}
rs = NULL;
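The effect of resetting `new_server` can be shown with a small user-space sketch. It assumes a much simpler control flow than the real function: `struct client`, `run_request`, `send_fn`, and `flaky_send` are hypothetical, and "resending state" is reduced to a counter. The point is only that the new-server handling is one-shot, so a retried request does not repeat it.

```c
#include <assert.h>
#include <stddef.h>

struct client {          /* hypothetical client state */
    int new_server;      /* set when an election has picked a new server */
    int resyncs;         /* how many times state was resent to a server */
};

/* Hypothetical transport: returns 0 on success, sets *retry to ask again. */
typedef int (*send_fn)(void *ctx, int *retry);

/*
 * Send a request, retrying while the transport asks for it. State is
 * resent to a new server once; clearing the flag keeps later retries
 * from going through the whole process again.
 */
static int run_request(struct client *c, send_fn send, void *ctx)
{
    int retry;
    int error;

    assert(c != NULL && send != NULL);
    do {
        retry = 0;
        if (c->new_server) {
            c->resyncs++;        /* e.g. resend marked regions */
            c->new_server = 0;   /* the reset added by the patch */
        }
        error = send(ctx, &retry);
    } while (retry);
    return error;
}

/* Example transport: fails and requests a retry *ctx times, then succeeds. */
static int flaky_send(void *ctx, int *retry)
{
    int *failures_left = ctx;

    if (*failures_left > 0) {
        (*failures_left)--;
        *retry = 1;
        return -1;
    }
    return 0;
}
```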
* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-client.c
@ 2006-11-07 20:48 jbrassow
0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2006-11-07 20:48 UTC (permalink / raw)
To: cluster-devel.redhat.com
CVSROOT: /cvs/cluster
Module name: cluster
Branch: RHEL4
Changes by: jbrassow at sourceware.org 2006-11-07 20:48:16
Modified files:
cmirror-kernel/src: dm-cmirror-client.c
Log message:
- Rather than just printing out "UNKNOWN", we print out the value
and then "unknown". This way, we can look up the value and see
if it is valid.
Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-client.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.1.2.26&r2=1.1.2.27
--- cluster/cmirror-kernel/src/Attic/dm-cmirror-client.c 2006/10/24 21:04:31 1.1.2.26
+++ cluster/cmirror-kernel/src/Attic/dm-cmirror-client.c 2006/11/07 20:48:16 1.1.2.27
@@ -527,7 +527,7 @@
spin_unlock(&region_state_lock);
goto out;
} else {
- DMINFO("Continuing request:: %s",
+ DMINFO("Continuing request type, %d (%s)", type,
(type == LRT_IS_CLEAN)? "LRT_IS_CLEAN":
(type == LRT_IN_SYNC)? "LRT_IN_SYNC":
(type == LRT_MARK_REGION)? "LRT_MARK_REGION":
* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-client.c
@ 2006-10-24 21:04 jbrassow
0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2006-10-24 21:04 UTC (permalink / raw)
To: cluster-devel.redhat.com
CVSROOT: /cvs/cluster
Module name: cluster
Branch: RHEL4
Changes by: jbrassow at sourceware.org 2006-10-24 21:04:31
Modified files:
cmirror-kernel/src: dm-cmirror-client.c
Log message:
- Bug 194131 - cluster mirror copy status can give false (0.00) percent
An error during communication can lead to a false reporting of 0%
in-sync regions. This can't be completely eliminated until rhel5,
but we can at least retry a couple times in rhel4.
Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-client.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.1.2.25&r2=1.1.2.26
--- cluster/cmirror-kernel/src/Attic/dm-cmirror-client.c 2006/09/08 15:53:56 1.1.2.25
+++ cluster/cmirror-kernel/src/Attic/dm-cmirror-client.c 2006/10/24 21:04:31 1.1.2.26
@@ -1040,6 +1040,7 @@
static region_t cluster_get_sync_count(struct dirty_log *log)
{
+ int i;
region_t rtn;
struct log_c *lc = (struct log_c *) log->context;
/* take out optimization
@@ -1047,7 +1048,9 @@
return lc->region_count;
}
*/
- if(consult_server(lc, 0, LRT_GET_SYNC_COUNT, &rtn)){
+ /* Try to get sync count up to five times */
+ for (i = 0; i < 5 && consult_server(lc, 0, LRT_GET_SYNC_COUNT, &rtn); i++);
+ if(i >= 5){
return 0;
}
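The bounded retry in the patch above can be written out as a standalone sketch. It is a user-space illustration only: `get_count_with_retry`, `query_fn`, and `fail_n_times` are hypothetical names, and the limit of five attempts is taken from the diff. As in the patch, exhausting all attempts still yields 0, so a false 0% report remains possible when every attempt fails.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ATTEMPTS 5   /* the patch tries up to five times */

/* Hypothetical stand-in for a consult_server() query: 0 on success. */
typedef int (*query_fn)(void *ctx, unsigned long *result);

/*
 * Ask for the sync count up to MAX_ATTEMPTS times. Returns the value from
 * the first successful attempt, or 0 if every attempt failed.
 */
static unsigned long get_count_with_retry(query_fn query, void *ctx)
{
    unsigned long result = 0;
    int i;

    assert(query != NULL);
    for (i = 0; i < MAX_ATTEMPTS; i++) {
        if (query(ctx, &result) == 0)
            return result;
    }
    return 0;
}

/* Example query: fails *ctx times, then reports 42 in-sync regions. */
static int fail_n_times(void *ctx, unsigned long *result)
{
    int *failures_left = ctx;

    if (*failures_left > 0) {
        (*failures_left)--;
        return -1;
    }
    *result = 42;
    return 0;
}
```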
* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-client.c
@ 2006-09-08 15:54 jbrassow
0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2006-09-08 15:54 UTC (permalink / raw)
To: cluster-devel.redhat.com
CVSROOT: /cvs/cluster
Module name: cluster
Branch: RHEL4
Changes by: jbrassow at sourceware.org 2006-09-08 15:53:57
Modified files:
cmirror-kernel/src: dm-cmirror-client.c
Log message:
- remove speed-bump that was put in for debugging
- delete 'marked region requests' when destructor is called
The regions have been marked on disk, but the memory struct
tracking that mark needs to be freed if a dtr comes along
before a 'clear region request'. This can happen when a
mirror image fails.
Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-client.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.1.2.24&r2=1.1.2.25
--- cluster/cmirror-kernel/src/Attic/dm-cmirror-client.c 2006/09/05 17:50:11 1.1.2.24
+++ cluster/cmirror-kernel/src/Attic/dm-cmirror-client.c 2006/09/08 15:53:56 1.1.2.25
@@ -528,7 +528,7 @@
goto out;
} else {
DMINFO("Continuing request:: %s",
- (type == LRT_IS_CLEAN)? "LRT_IS_C LEAN":
+ (type == LRT_IS_CLEAN)? "LRT_IS_CLEAN":
(type == LRT_IN_SYNC)? "LRT_IN_SYNC":
(type == LRT_MARK_REGION)? "LRT_MARK_REGION":
(type == LRT_GET_RESYNC_WORK)? "LRT_GET_RESYNC_WORK":
@@ -732,6 +732,7 @@
static void cluster_dtr(struct dirty_log *log)
{
struct log_c *lc = (struct log_c *) log->context;
+ struct region_state *rs, *tmp_rs;
if (!list_empty(&clear_region_list))
DMINFO("Leaving while clear region requests remain.");
@@ -742,6 +743,20 @@
sock_release(lc->client_sock);
+ spin_lock(&region_state_lock);
+
+ list_for_each_entry_safe(rs, tmp_rs, &clear_region_list, rs_list) {
+ if (lc == rs->rs_lc)
+ list_del_init(&rs->rs_list);
+ }
+
+ list_for_each_entry_safe(rs, tmp_rs, &marked_region_list, rs_list) {
+ if (lc == rs->rs_lc)
+ list_del_init(&rs->rs_list);
+ }
+
+ spin_unlock(&region_state_lock);
+
if (lc->log_dev)
disk_dtr(log);
else
@@ -758,7 +773,6 @@
static int cluster_postsuspend(struct dirty_log *log)
{
- int r;
struct log_c *lc = (struct log_c *) log->context;
while (1) {
@@ -876,6 +890,8 @@
rs_new = mempool_alloc(region_state_pool, GFP_KERNEL);
+ memset(rs_new, 0, sizeof(struct region_state));
+
spin_lock(&region_state_lock);
list_for_each_entry_safe(rs, tmp_rs, &clear_region_list, rs_list){
if(lc == rs->rs_lc && region == rs->rs_region){
@@ -950,6 +966,8 @@
rs_new = mempool_alloc(region_state_pool, GFP_ATOMIC);
+ memset(rs_new, 0, sizeof(struct region_state));
+
spin_lock(&region_state_lock);
list_for_each_entry_safe(rs, tmp_rs, &clear_region_list, rs_list){
@@ -968,9 +986,6 @@
list_del_init(&rs->rs_list);
list_add(&rs->rs_list, &clear_region_list);
clear_region_count++;
- if(!(clear_region_count & 0x7F)){
- DMINFO("clear_region_count :: %d", clear_region_count);
- }
spin_unlock(&region_state_lock);
if (rs_new)
mempool_free(rs_new, region_state_pool);
@@ -1017,11 +1032,8 @@
while(consult_server(lc, region, LRT_COMPLETE_RESYNC_WORK, &success_tmp)){
DMWARN("unable to notify server of completed resync work");
}
- if (!success) {
+ if (!success)
DMERR("Attempting to revert sync status of region #%llu", region);
- set_current_state(TASK_INTERRUPTIBLE);
- schedule_timeout(HZ/5);
- }
return;
}
* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-client.c
@ 2006-09-08 15:52 jbrassow
0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2006-09-08 15:52 UTC (permalink / raw)
To: cluster-devel.redhat.com
CVSROOT: /cvs/cluster
Module name: cluster
Branch: RHEL4U4
Changes by: jbrassow at sourceware.org 2006-09-08 15:52:41
Modified files:
cmirror-kernel/src: dm-cmirror-client.c
Log message:
- remove speed-bump that was put in for debugging
- delete 'marked region requests' when destructor is called
The regions have been marked on disk, but the memory struct
tracking that mark needs to be freed if a dtr comes along
before a 'clear region request'. This can happen when a
mirror image fails.
Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-client.c.diff?cvsroot=cluster&only_with_tag=RHEL4U4&r1=1.1.2.19.2.5&r2=1.1.2.19.2.6
--- cluster/cmirror-kernel/src/Attic/dm-cmirror-client.c 2006/09/05 17:48:02 1.1.2.19.2.5
+++ cluster/cmirror-kernel/src/Attic/dm-cmirror-client.c 2006/09/08 15:52:40 1.1.2.19.2.6
@@ -528,7 +528,7 @@
goto out;
} else {
DMINFO("Continuing request:: %s",
- (type == LRT_IS_CLEAN)? "LRT_IS_C LEAN":
+ (type == LRT_IS_CLEAN)? "LRT_IS_CLEAN":
(type == LRT_IN_SYNC)? "LRT_IN_SYNC":
(type == LRT_MARK_REGION)? "LRT_MARK_REGION":
(type == LRT_GET_RESYNC_WORK)? "LRT_GET_RESYNC_WORK":
@@ -732,6 +732,7 @@
static void cluster_dtr(struct dirty_log *log)
{
struct log_c *lc = (struct log_c *) log->context;
+ struct region_state *rs, *tmp_rs;
if (!list_empty(&clear_region_list))
DMINFO("Leaving while clear region requests remain.");
@@ -742,6 +743,20 @@
sock_release(lc->client_sock);
+ spin_lock(&region_state_lock);
+
+ list_for_each_entry_safe(rs, tmp_rs, &clear_region_list, rs_list) {
+ if (lc == rs->rs_lc)
+ list_del_init(&rs->rs_list);
+ }
+
+ list_for_each_entry_safe(rs, tmp_rs, &marked_region_list, rs_list) {
+ if (lc == rs->rs_lc)
+ list_del_init(&rs->rs_list);
+ }
+
+ spin_unlock(&region_state_lock);
+
if (lc->log_dev)
disk_dtr(log);
else
@@ -758,7 +773,6 @@
static int cluster_postsuspend(struct dirty_log *log)
{
- int r;
struct log_c *lc = (struct log_c *) log->context;
while (1) {
@@ -876,6 +890,8 @@
rs_new = mempool_alloc(region_state_pool, GFP_KERNEL);
+ memset(rs_new, 0, sizeof(struct region_state));
+
spin_lock(&region_state_lock);
list_for_each_entry_safe(rs, tmp_rs, &clear_region_list, rs_list){
if(lc == rs->rs_lc && region == rs->rs_region){
@@ -950,6 +966,8 @@
rs_new = mempool_alloc(region_state_pool, GFP_ATOMIC);
+ memset(rs_new, 0, sizeof(struct region_state));
+
spin_lock(&region_state_lock);
list_for_each_entry_safe(rs, tmp_rs, &clear_region_list, rs_list){
@@ -968,9 +986,6 @@
list_del_init(&rs->rs_list);
list_add(&rs->rs_list, &clear_region_list);
clear_region_count++;
- if(!(clear_region_count & 0x7F)){
- DMINFO("clear_region_count :: %d", clear_region_count);
- }
spin_unlock(&region_state_lock);
if (rs_new)
mempool_free(rs_new, region_state_pool);
@@ -1017,11 +1032,8 @@
while(consult_server(lc, region, LRT_COMPLETE_RESYNC_WORK, &success_tmp)){
DMWARN("unable to notify server of completed resync work");
}
- if (!success) {
+ if (!success)
DMERR("Attempting to revert sync status of region #%llu", region);
- set_current_state(TASK_INTERRUPTIBLE);
- schedule_timeout(HZ/5);
- }
return;
}
* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-client.c
@ 2006-07-07 17:12 jbrassow
0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2006-07-07 17:12 UTC (permalink / raw)
To: cluster-devel.redhat.com
CVSROOT: /cvs/cluster
Module name: cluster
Branch: RHEL4U4
Changes by: jbrassow at sourceware.org 2006-07-07 17:12:22
Modified files:
cmirror-kernel/src: dm-cmirror-client.c
Log message:
- If a cluster mirror was removed, and a new one created fast enough, the
processes could overlap such that the creation would fail to register
with cman because the remove hadn't unregistered yet (-EEXIST)
Fix for bug 197952 - cluster mirror creation fails due to an inablity to connect to cluster infrastructure
Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-client.c.diff?cvsroot=cluster&only_with_tag=RHEL4U4&r1=1.1.2.19.2.1&r2=1.1.2.19.2.2
--- cluster/cmirror-kernel/src/Attic/dm-cmirror-client.c 2006/06/29 19:46:37 1.1.2.19.2.1
+++ cluster/cmirror-kernel/src/Attic/dm-cmirror-client.c 2006/07/07 17:12:22 1.1.2.19.2.2
@@ -1206,26 +1206,31 @@
};
static int mirror_set_count = 0; /* used to prevent multiple cluster [dis]connects */
+static DECLARE_MUTEX(cmirror_register_lock);
static int cluster_connect(void)
{
- int r;
+ int r = 0;
- if (mirror_set_count++)
- return 0;
+ down(&cmirror_register_lock);
+
+ if (mirror_set_count++) {
+ up(&cmirror_register_lock);
+ goto out;
+ }
r = kcl_register_service("clustered_log", 13, SERVICE_LEVEL_GDLM, &clog_ops,
1, NULL, &local_id);
if (r) {
- DMWARN("Couldn't register clustered_log service");
- return r;
+ DMWARN("Couldn't register clustered_log service. Reason: %d", r);
+ goto out;
}
r = start_server();
if(r){
DMWARN("Unable to start clustered log server daemon");
kcl_unregister_service(local_id);
- return r;
+ goto out;
}
r = kcl_join_service(local_id);
@@ -1236,13 +1241,19 @@
kcl_unregister_service(local_id);
}
+out:
+ up(&cmirror_register_lock);
return r;
}
static int cluster_disconnect(void)
{
- if (--mirror_set_count)
+ down(&cmirror_register_lock);
+
+ if (--mirror_set_count) {
+ up(&cmirror_register_lock);
return 0;
+ }
/* By setting 'shutting_down', the server will not be suspended **
** when a stop is received */
@@ -1251,6 +1262,7 @@
stop_server();
kcl_unregister_service(local_id);
+ up(&cmirror_register_lock);
return 0;
}
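The idea behind this patch, serialising the first connect and the last disconnect so that a new registration cannot overlap an unfinished unregistration, can be sketched in user space with a pthread mutex. This is an illustration under assumptions rather than the kernel code: `service_connect`, `service_disconnect`, `do_register`, and `do_unregister` are hypothetical, a pthread mutex replaces the kernel semaphore, and the sketch undoes the count on a failed registration, which is a choice made here and not something the diff shows. Each path in the sketch releases the lock exactly once.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t register_lock = PTHREAD_MUTEX_INITIALIZER;
static int user_count;    /* number of active users of the service */
static int registered;    /* 1 while the simulated service is registered */

/* Hypothetical stand-ins for registering and unregistering with the cluster. */
static int do_register(void)
{
    if (registered)
        return -EEXIST;   /* the failure seen when remove and create overlap */
    registered = 1;
    return 0;
}

static void do_unregister(void)
{
    registered = 0;
}

/* Register on the first connect only; later callers just take a reference. */
static int service_connect(void)
{
    int r = 0;

    pthread_mutex_lock(&register_lock);
    if (user_count++ == 0) {
        r = do_register();
        if (r)
            user_count--;   /* sketch-only choice: drop the reference on failure */
    }
    pthread_mutex_unlock(&register_lock);
    return r;
}

/* Unregister when the last user disconnects, still under the lock. */
static int service_disconnect(void)
{
    pthread_mutex_lock(&register_lock);
    assert(user_count > 0);
    if (--user_count == 0)
        do_unregister();
    pthread_mutex_unlock(&register_lock);
    return 0;
}
```

Because the unregistration happens while the lock is held, a concurrent `service_connect` cannot reach `do_register` until the previous registration is gone.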
* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-client.c
@ 2006-07-07 17:09 jbrassow
0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2006-07-07 17:09 UTC (permalink / raw)
To: cluster-devel.redhat.com
CVSROOT: /cvs/cluster
Module name: cluster
Branch: STABLE
Changes by: jbrassow at sourceware.org 2006-07-07 17:09:54
Modified files:
cmirror-kernel/src: dm-cmirror-client.c
Log message:
- If a cluster mirror was removed, and a new one created fast enough, the
processes could overlap such that the creation would fail to register
with cman because the remove hadn't unregistered yet (-EEXIST)
Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-client.c.diff?cvsroot=cluster&only_with_tag=STABLE&r1=1.1.4.3&r2=1.1.4.4
--- cluster/cmirror-kernel/src/Attic/dm-cmirror-client.c 2006/06/29 19:49:32 1.1.4.3
+++ cluster/cmirror-kernel/src/Attic/dm-cmirror-client.c 2006/07/07 17:09:54 1.1.4.4
@@ -1222,26 +1222,31 @@
};
static int mirror_set_count = 0; /* used to prevent multiple cluster [dis]connects */
+static DECLARE_MUTEX(cmirror_register_lock);
static int cluster_connect(void)
{
- int r;
+ int r = 0;
- if (mirror_set_count++)
- return 0;
+ down(&cmirror_register_lock);
+
+ if (mirror_set_count++) {
+ up(&cmirror_register_lock);
+ goto out;
+ }
r = kcl_register_service("clustered_log", 13, SERVICE_LEVEL_GDLM, &clog_ops,
1, NULL, &local_id);
if (r) {
- DMWARN("Couldn't register clustered_log service");
- return r;
+ DMWARN("Couldn't register clustered_log service. Reason: %d", r);
+ goto out;
}
r = start_server();
if(r){
DMWARN("Unable to start clustered log server daemon");
kcl_unregister_service(local_id);
- return r;
+ goto out;
}
r = kcl_join_service(local_id);
@@ -1252,13 +1257,19 @@
kcl_unregister_service(local_id);
}
+out:
+ up(&cmirror_register_lock);
return r;
}
static int cluster_disconnect(void)
{
- if (--mirror_set_count)
+ down(&cmirror_register_lock);
+
+ if (--mirror_set_count) {
+ up(&cmirror_register_lock);
return 0;
+ }
/* By setting 'shutting_down', the server will not be suspended **
** when a stop is received */
@@ -1267,6 +1278,7 @@
stop_server();
kcl_unregister_service(local_id);
+ up(&cmirror_register_lock);
return 0;
}
* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-client.c
@ 2006-07-07 17:08 jbrassow
0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2006-07-07 17:08 UTC (permalink / raw)
To: cluster-devel.redhat.com
CVSROOT: /cvs/cluster
Module name: cluster
Branch: RHEL4
Changes by: jbrassow at sourceware.org 2006-07-07 17:08:56
Modified files:
cmirror-kernel/src: dm-cmirror-client.c
Log message:
- If a cluster mirror was removed, and a new one created fast enough, the
processes could overlap such that the creation would fail to register
with cman because the remove hadn't unregistered yet (-EEXIST)
This should fix bug 197952
Waiting to commit this to rhel4u4 until the bug becomes a blocker.
Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-client.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.1.2.20&r2=1.1.2.21
--- cluster/cmirror-kernel/src/Attic/dm-cmirror-client.c 2006/06/29 19:48:01 1.1.2.20
+++ cluster/cmirror-kernel/src/Attic/dm-cmirror-client.c 2006/07/07 17:08:56 1.1.2.21
@@ -1206,26 +1206,31 @@
};
static int mirror_set_count = 0; /* used to prevent multiple cluster [dis]connects */
+static DECLARE_MUTEX(cmirror_register_lock);
static int cluster_connect(void)
{
- int r;
+ int r = 0;
- if (mirror_set_count++)
- return 0;
+ down(&cmirror_register_lock);
+
+ if (mirror_set_count++) {
+ up(&cmirror_register_lock);
+ goto out;
+ }
r = kcl_register_service("clustered_log", 13, SERVICE_LEVEL_GDLM, &clog_ops,
1, NULL, &local_id);
if (r) {
- DMWARN("Couldn't register clustered_log service");
- return r;
+ DMWARN("Couldn't register clustered_log service. Reason: %d", r);
+ goto out;
}
r = start_server();
if(r){
DMWARN("Unable to start clustered log server daemon");
kcl_unregister_service(local_id);
- return r;
+ goto out;
}
r = kcl_join_service(local_id);
@@ -1236,13 +1241,19 @@
kcl_unregister_service(local_id);
}
+out:
+ up(&cmirror_register_lock);
return r;
}
static int cluster_disconnect(void)
{
- if (--mirror_set_count)
+ down(&cmirror_register_lock);
+
+ if (--mirror_set_count) {
+ up(&cmirror_register_lock);
return 0;
+ }
/* By setting 'shutting_down', the server will not be suspended **
** when a stop is received */
@@ -1251,6 +1262,7 @@
stop_server();
kcl_unregister_service(local_id);
+ up(&cmirror_register_lock);
return 0;
}
* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-client.c
@ 2006-06-15 17:53 jbrassow
0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2006-06-15 17:53 UTC (permalink / raw)
To: cluster-devel.redhat.com
CVSROOT: /cvs/cluster
Module name: cluster
Branch: RHEL4
Changes by: jbrassow at sourceware.org 2006-06-15 17:53:15
Modified files:
cmirror-kernel/src: dm-cmirror-client.c
Log message:
- bah! typo.
Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-client.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.1.2.16&r2=1.1.2.17
* [Cluster-devel] cluster/cmirror-kernel/src dm-cmirror-client.c
@ 2006-06-15 17:44 jbrassow
0 siblings, 0 replies; 17+ messages in thread
From: jbrassow @ 2006-06-15 17:44 UTC (permalink / raw)
To: cluster-devel.redhat.com
CVSROOT: /cvs/cluster
Module name: cluster
Branch: RHEL4
Changes by: jbrassow at sourceware.org 2006-06-15 17:44:56
Modified files:
cmirror-kernel/src: dm-cmirror-client.c
Log message:
- it was possible under heavy I/O and machine joining/leaving to run into
a situation where the log server leaves and we have to rerun elections.
Failing to do so would result in never making progress.
This bug only occurs if the log server fails/leaves, we need to mark a
region, and the newly appointed server also fails/leaves.
Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cmirror-kernel/src/dm-cmirror-client.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.1.2.15&r2=1.1.2.16