Distributed Replicated Block Device (DRBD) development
* [Drbd-dev] RE: [DRBD-cvs] svn commit by phil - r2607 - trunk/drbd - The fix for the "both nodes in WFBitMaps" issue, Ernest
From: Montrose, Ernest @ 2006-12-20 17:46 UTC
  To: drbd-dev

Hi Phil,
I am still seeing the issue that I reported a while back and for which
you submitted a fix (see the original message below).  Essentially, after
the DRBD heartbeat link is disconnected and a split brain occurs, both
nodes think they should be the sync source.  Each node moves its peer
(the other node) through the following state changes, at the same time:
peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )
This state persists across a reboot, and the drbd_receiver thread loops
on both nodes with:

[root@morticia ~]# tail -f /var/log/messages
Dec 19 14:12:00 morticia kernel: drbd0: [drbd0_receiver/5841] sock_sendmsg time expired, ko = 4294965943
Dec 19 14:12:06 morticia kernel: drbd0: [drbd0_receiver/5841] sock_sendmsg time expired, ko = 4294965942
Dec 19 14:12:12 morticia kernel: drbd0: [drbd0_receiver/5841] sock_sendmsg time expired, ko = 4294965941
Dec 19 14:12:18 morticia kernel: drbd0: [drbd0_receiver/5841] sock_sendmsg time expired, ko = 4294965940
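
A note on the counter in these log lines: the "ko" values sit just below 2^32 and drop by one per message, which is what an unsigned 32-bit counter looks like after it has been decremented past zero. The sketch below assumes exactly that (a 32-bit unsigned countdown printed in decimal; the thread does not confirm the counter's type) and recovers how many decrements have happened since the wrap. The function name ko_decrements_past_zero is hypothetical and not part of DRBD.

```c
#include <stdint.h>

/*
 * Assumption: "ko" is a 32-bit unsigned countdown that has wrapped
 * below zero. Unsigned subtraction is modular, so 0 - printed equals
 * 2^32 - printed, i.e. the number of decrements since the counter
 * was at zero.
 */
uint32_t ko_decrements_past_zero(uint32_t printed)
{
    return (uint32_t)(0u - printed);
}
```

For the first log line, ko_decrements_past_zero(4294965943u) gives 1353. If the counter did start at zero and the roughly 6-second spacing seen in the timestamps was constant, that would correspond to a little over two hours of retries.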



-----Original Message-----
From: drbd-cvs-bounces@linbit.com [mailto:drbd-cvs-bounces@linbit.com]
On Behalf Of drbd-cvs@linbit.com
Sent: Friday, December 01, 2006 4:42 AM
To: drbd-cvs@linbit.com
Subject: [DRBD-cvs] svn commit by phil - r2607 - trunk/drbd - The fix
for the "both nodes in WFBitMaps" issue, Ernest

Author: phil
Date: 2006-12-01 10:41:58 +0100 (Fri, 01 Dec 2006)
New Revision: 2607

Modified:
   trunk/drbd/drbd_int.h
   trunk/drbd/drbd_main.c
   trunk/drbd/drbd_receiver.c
Log:
The fix for the "both nodes in WFBitMaps" issue, Ernest reported.



Modified: trunk/drbd/drbd_int.h
===================================================================
--- trunk/drbd/drbd_int.h	2006-11-30 15:00:22 UTC (rev 2606)
+++ trunk/drbd/drbd_int.h	2006-12-01 09:41:58 UTC (rev 2607)
@@ -853,6 +853,7 @@
 	unsigned int peer_seq;
 	spinlock_t peer_seq_lock;
 	int minor;
+	unsigned long comm_bm_set; // communicated number of set bits.
 };
 
 static inline drbd_dev *minor_to_mdev(int minor)

Modified: trunk/drbd/drbd_main.c
===================================================================
--- trunk/drbd/drbd_main.c	2006-11-30 15:00:22 UTC (rev 2606)
+++ trunk/drbd/drbd_main.c	2006-12-01 09:41:58 UTC (rev 2607)
@@ -1288,7 +1288,8 @@
 			: 0;
 	}
 
-	p.uuid[UUID_SIZE] = cpu_to_be64(drbd_bm_total_weight(mdev));
+	mdev->comm_bm_set = drbd_bm_total_weight(mdev);
+	p.uuid[UUID_SIZE] = cpu_to_be64(mdev->comm_bm_set);
 	uuid_flags |= mdev->net_conf->want_lose ? 1 : 0;
 	uuid_flags |= test_bit(CRASHED_PRIMARY, &mdev->flags) ? 2 : 0;
 	p.uuid[UUID_FLAGS] = cpu_to_be64(uuid_flags);

Modified: trunk/drbd/drbd_receiver.c
===================================================================
--- trunk/drbd/drbd_receiver.c	2006-11-30 15:00:22 UTC (rev 2606)
+++ trunk/drbd/drbd_receiver.c	2006-12-01 09:41:58 UTC (rev 2607)
@@ -1668,7 +1668,7 @@
 	peer = mdev->p_uuid[Bitmap] & 1;
 
 	ch_peer = mdev->p_uuid[UUID_SIZE];
-	ch_self = drbd_bm_total_weight(mdev);
+	ch_self = mdev->comm_bm_set;
 
 	switch ( mdev->net_conf->after_sb_0p ) {
 	case Consensus:
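
Read together, the three hunks above make each node compare the bitmap weight it actually sent to its peer (comm_bm_set) rather than a value recomputed with drbd_bm_total_weight() at decision time. A minimal sketch of why that can matter follows. Everything in it is an assumption for illustration: sb_pick_role, roles_consistent, the tie-break flag, and the "more changes becomes sync source" rule are hypothetical stand-ins, not DRBD's policy code.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the change-count comparison.
 * Returns +1 if this node should become sync source, -1 if sync target.
 * self_wins_tie is an assumed tie-breaker that differs between the nodes.
 */
int sb_pick_role(unsigned long ch_self, unsigned long ch_peer,
                 bool self_wins_tie)
{
    if (ch_self > ch_peer)
        return 1;
    if (ch_self < ch_peer)
        return -1;
    return self_wins_tie ? 1 : -1;
}

/* The two verdicts agree only if one is source and the other target. */
bool roles_consistent(int verdict_a, int verdict_b)
{
    return verdict_a == -verdict_b;
}
```

Suppose node A sent a weight of 100 and node B sent 120, and A's bitmap then grew to 150 before it ran the comparison. With the recomputed value, A evaluates sb_pick_role(150, 120, ...) and B evaluates sb_pick_role(120, 100, ...), so both return +1 and both nodes would try to be sync source. With the communicated value, A evaluates sb_pick_role(100, 120, ...) and gets -1, which is the mirror image of B's verdict.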



* Re: [Drbd-dev] RE: [DRBD-cvs] svn commit by phil - r2607 - trunk/drbd - The fix for the "both nodes in WFBitMaps" issue, Ernest
From: Philipp Reisner @ 2006-12-21  8:38 UTC
  To: drbd-dev

On Wednesday, 20 December 2006 18:46, Montrose, Ernest wrote:
> Hi Phil,
> I am still seeing the issue that I reported a while back and for which
> you submitted a fix (see the original message below).  Essentially, after
> the DRBD heartbeat link is disconnected and a split brain occurs, both
> nodes think they should be the sync source.  Each node moves its peer
> (the other node) through the following state changes, at the same time:
> peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )
> This state persists across a reboot, and the drbd_receiver thread loops
> on both nodes with:
>


Hi Ernest,

Could you please post the kernel messages from both nodes, covering
both the disconnect time and the reconnect time?

The output of "drbdadm show-gi" on both nodes for the resource in
question would be helpful as well.

Thanks!

-Phil
-- 
: Dipl-Ing Philipp Reisner                      Tel +43-1-8178292-50 :
: LINBIT Information Technologies GmbH          Fax +43-1-8178292-82 :
: Vivenotgasse 48, 1120 Vienna, Austria        http://www.linbit.com :
