From mboxrd@z Thu Jan  1 00:00:00 1970
From: pcaulfield@sourceware.org
Date: 20 Nov 2007 11:02:48 -0000
Subject: [Cluster-devel] cluster/cman-kernel/src cnxman.c membership.c
Message-ID: <20071120110248.10031.qmail@sourceware.org>
List-Id:
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch:		RHEL4
Changes by:	pcaulfield at sourceware.org	2007-11-20 11:02:46

Modified files:
	cman-kernel/src: cnxman.c membership.c

Log message:
	A fix to the last patch. The last_ackneeded_seq_recv variable needed
	to be cleared when the node went down, otherwise we end up comparing
	received sequence numbers against old ones and end up throwing all
	new messages away!

	bz#387081 shows this happening. The "inconsistent" message is slightly
	misleading in this context

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cman-kernel/src/cnxman.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.42.2.30&r2=1.42.2.31
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/cman-kernel/src/membership.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.44.2.28&r2=1.44.2.29

--- cluster/cman-kernel/src/Attic/cnxman.c	2007/11/01 10:44:25	1.42.2.30
+++ cluster/cman-kernel/src/Attic/cnxman.c	2007/11/20 11:02:45	1.42.2.31
@@ -867,10 +867,10 @@
 	/* Have we received this message before ? If so just ignore it, it's a
 	 * resend for someone else's benefit */
 	if (!(flags & MSG_NOACK) &&
-	    rem_node && rem_node->last_seq_recv &&
+	    rem_node && rem_node->last_ackneeded_seq_recv &&
 	    (short)((short)le16_to_cpu(header->seq) - (short)rem_node->last_ackneeded_seq_recv) <= 0) {
-		P_COMMS("Discarding message - seq = %d, last_seen = %d\n",
-			header->seq, rem_node->last_seq_recv);
+		P_COMMS("Discarding message - seq = %d, last_seen = %d, last_acked_seen = %d\n",
+			header->seq, rem_node->last_seq_recv, rem_node->last_ackneeded_seq_recv);
 		/* Still need to ACK it though, in case it was the ACK that got
 		 * lost */
 		cl_sendack(csock, header->seq, addrlen, addr, header->tgtport, 0);
--- cluster/cman-kernel/src/Attic/membership.c	2007/09/19 15:01:07	1.44.2.28
+++ cluster/cman-kernel/src/Attic/membership.c	2007/11/20 11:02:46	1.44.2.29
@@ -1398,6 +1398,7 @@
 	cluster_members--;
 	node->state = NODESTATE_DEAD;
 	node->last_seq_recv = 0;
+	node->last_ackneeded_seq_recv = 0;
 	up(&cluster_members_lock);
 
 	send_nodedown(node->node_id, node->leave_reason);
@@ -1683,6 +1684,7 @@
 	newnode->us = 0;
 	newnode->leave_reason = 0;
 	newnode->last_seq_recv = 0;
+	newnode->last_ackneeded_seq_recv = 0;
 	newnode->last_seq_acked = 0;
 	newnode->last_seq_sent = 0;
 	newnode->incarnation++;
@@ -1717,6 +1719,7 @@
 	newnode->us = 0;
 	newnode->leave_reason = 0;
 	newnode->last_seq_recv = 0;
+	newnode->last_ackneeded_seq_recv = 0;
 	newnode->last_seq_acked = 0;
 	newnode->last_seq_sent = 0;
 	newnode->incarnation = 0;
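
For anyone following along, the failure described in the log message can be reproduced outside the kernel with a small sketch. This is not the cman code: struct node_state, is_duplicate() and discarded_after_rejoin() are hypothetical names made up for illustration, and the sketch assumes that a rejoining node restarts its sequence numbers at 1 and that the receiver records the sequence number of each message it accepts. Only the comparison itself is modelled on the test in cnxman.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-node field touched by the patch. */
struct node_state {
	uint16_t last_ackneeded_seq_recv;	/* 0 means "nothing seen yet" */
};

/* Returns 1 if seq would be discarded as a resend. The difference is
 * reduced to a signed 16-bit value so that the comparison keeps working
 * across wraparound, as the (short) casts do in cnxman.c. This relies on
 * the usual two's-complement narrowing conversion. */
static int is_duplicate(const struct node_state *n, uint16_t seq)
{
	assert(n != 0);
	return n->last_ackneeded_seq_recv &&
	       (int16_t)(uint16_t)(seq - n->last_ackneeded_seq_recv) <= 0;
}

/* Feeds messages with seq 1..count, as an (assumed) freshly rejoined
 * node would send them, and counts how many are thrown away. Accepted
 * messages update the stored sequence number. */
static unsigned discarded_after_rejoin(struct node_state *n, unsigned count)
{
	unsigned dropped = 0;
	unsigned i;

	for (i = 1; i <= count; i++) {
		uint16_t seq = (uint16_t)i;

		if (is_duplicate(n, seq))
			dropped++;
		else
			n->last_ackneeded_seq_recv = seq;
	}
	return dropped;
}
```

With a stale value such as 500 left over from before the node went down, every one of the first 100 messages from the rejoined node is treated as a resend. Once the field has been zeroed, as the membership.c hunks now do, none of them are.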