netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net 0/3] tipc: three small fixes
@ 2016-07-11 20:08 Jon Maloy
  2016-07-11 20:08 ` [PATCH net 1/3] tipc: extend broadcast link initialization criteria Jon Maloy
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Jon Maloy @ 2016-07-11 20:08 UTC (permalink / raw)
  To: davem
  Cc: netdev, Paul Gortmaker, parthasarathy.bhuvaragan, richard.alpe,
	ying.xue, maloy, tipc-discussion, Jon Maloy

Fixes for some broadcast link problems that may occur in large systems.

Jon Maloy (3):
  tipc: extend broadcast link initialization criteria
  tipc: ensure correct broadcast send buffer release when peer is lost
  tipc: reset all unicast links when broadcast send link fails

 net/tipc/bearer.c | 15 +++++++++++++++
 net/tipc/bearer.h |  1 +
 net/tipc/link.c   |  9 ++++++++-
 net/tipc/node.c   | 15 +++++++++++----
 4 files changed, 35 insertions(+), 5 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH net 1/3] tipc: extend broadcast link initialization criteria
  2016-07-11 20:08 [PATCH net 0/3] tipc: three small fixes Jon Maloy
@ 2016-07-11 20:08 ` Jon Maloy
  2016-07-11 20:08 ` [PATCH net 2/3] tipc: ensure correct broadcast send buffer release when peer is lost Jon Maloy
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Jon Maloy @ 2016-07-11 20:08 UTC (permalink / raw)
  To: davem; +Cc: Jon Maloy, netdev, Paul Gortmaker, tipc-discussion

At first contact between two nodes, an endpoint might sometimes have
time to send out a LINK_PROTOCOL/STATE packet before it has received
the broadcast initialization packet from the peer, i.e., before it has
received a valid broadcast packet number to add to the 'bc_ack' field
of the protocol message.

This means that the peer endpoint will receive a protocol packet with an
invalid broadcast acknowledge value of 0. Under unlucky circumstances
this may lead to the original, already received acknowledge value being
overwritten, so that the whole broadcast link goes stale after a while.

We fix this by delaying the setting of the link field 'bc_peer_is_up'
until we know that the peer really has received our own broadcast
initialization message. The latter is always sent out as the first
unicast message on a link, and always with seqeunce number 1. Because
of this, we only need to look for a non-zero unicast acknowledge value
in the arriving STATE messages, and once that is confirmed we know we
are safe and can set the mentioned field. Before this moment, we must
ignore all broadcast acknowledges from the peer.

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
---
 net/tipc/link.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/net/tipc/link.c b/net/tipc/link.c
index 67b6ab9..6483dc4 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -1559,7 +1559,12 @@ void tipc_link_bc_sync_rcv(struct tipc_link *l, struct tipc_msg *hdr,
 	if (!msg_peer_node_is_up(hdr))
 		return;
 
-	l->bc_peer_is_up = true;
+	/* Open when peer ackowledges our bcast init msg (pkt #1) */
+	if (msg_ack(hdr))
+		l->bc_peer_is_up = true;
+
+	if (!l->bc_peer_is_up)
+		return;
 
 	/* Ignore if peers_snd_nxt goes beyond receive window */
 	if (more(peers_snd_nxt, l->rcv_nxt + l->window))
-- 
1.9.1


------------------------------------------------------------------------------
Attend Shape: An AT&T Tech Expo July 15-16. Meet us at AT&T Park in San
Francisco, CA to explore cutting-edge tech and listen to tech luminaries
present their vision of the future. This family event has something for
everyone, including kids. Get more information and register today.
http://sdm.link/attshape

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH net 2/3] tipc: ensure correct broadcast send buffer release when peer is lost
  2016-07-11 20:08 [PATCH net 0/3] tipc: three small fixes Jon Maloy
  2016-07-11 20:08 ` [PATCH net 1/3] tipc: extend broadcast link initialization criteria Jon Maloy
@ 2016-07-11 20:08 ` Jon Maloy
  2016-07-11 20:08 ` [PATCH net 3/3] tipc: reset all unicast links when broadcast send link fails Jon Maloy
  2016-07-12  5:43 ` [PATCH net 0/3] tipc: three small fixes David Miller
  3 siblings, 0 replies; 5+ messages in thread
From: Jon Maloy @ 2016-07-11 20:08 UTC (permalink / raw)
  To: davem; +Cc: Jon Maloy, netdev, Paul Gortmaker, tipc-discussion

After a new receiver peer has been added to the broadcast transmission
link, we allow immediate transmission of new broadcast packets, trusting
that the new peer will not accept the packets until it has received the
previously sent unicast broadcast initialiation message. In the same
way, the sender must not accept any acknowledges until it has itself
received the broadcast initialization from the peer, as well as
confirmation of the reception of its own initialization message.

Furthermore, when a receiver peer goes down, the sender has to produce
the missing acknowledges from the lost peer locally, in order ensure
correct release of the buffers that were expected to be acknowledged by
the said peer.

In a highly stressed system we have observed that contact with a peer
may come up and be lost before the above mentioned broadcast initial-
ization and confirmation have been received. This leads to the locally
produced acknowledges being rejected, and the non-acknowledged buffers
to linger in the broadcast link transmission queue until it fills up
and the link goes into permanent congestion.

In this commit, we remedy this by temporarily setting the corresponding
broadcast receive link state to ESTABLISHED and the 'bc_peer_is_up'
state to true before we issue the local acknowledges. This ensures that
those acknowledges will always be accepted. The mentioned state values
are restored immediately afterwards when the link is reset.

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
---
 net/tipc/link.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/tipc/link.c b/net/tipc/link.c
index 6483dc4..7d89f87 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -349,6 +349,8 @@ void tipc_link_remove_bc_peer(struct tipc_link *snd_l,
 	u16 ack = snd_l->snd_nxt - 1;
 
 	snd_l->ackers--;
+	rcv_l->bc_peer_is_up = true;
+	rcv_l->state = LINK_ESTABLISHED;
 	tipc_link_bc_ack_rcv(rcv_l, ack, xmitq);
 	tipc_link_reset(rcv_l);
 	rcv_l->state = LINK_RESET;
-- 
1.9.1


------------------------------------------------------------------------------
Attend Shape: An AT&T Tech Expo July 15-16. Meet us at AT&T Park in San
Francisco, CA to explore cutting-edge tech and listen to tech luminaries
present their vision of the future. This family event has something for
everyone, including kids. Get more information and register today.
http://sdm.link/attshape

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH net 3/3] tipc: reset all unicast links when broadcast send link fails
  2016-07-11 20:08 [PATCH net 0/3] tipc: three small fixes Jon Maloy
  2016-07-11 20:08 ` [PATCH net 1/3] tipc: extend broadcast link initialization criteria Jon Maloy
  2016-07-11 20:08 ` [PATCH net 2/3] tipc: ensure correct broadcast send buffer release when peer is lost Jon Maloy
@ 2016-07-11 20:08 ` Jon Maloy
  2016-07-12  5:43 ` [PATCH net 0/3] tipc: three small fixes David Miller
  3 siblings, 0 replies; 5+ messages in thread
From: Jon Maloy @ 2016-07-11 20:08 UTC (permalink / raw)
  To: davem; +Cc: Jon Maloy, netdev, Paul Gortmaker, tipc-discussion

In test situations with many nodes and a heavily stressed system we have
observed that the transmission broadcast link may fail due to an
excessive number of retransmissions of the same packet. In such
situations we need to reset all unicast links to all peers, in order to
reset and re-synchronize the broadcast link.

In this commit, we add a new function tipc_bearer_reset_all() to be used
in such situations. The function scans across all bearers and resets all
their pertaining links.

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
---
 net/tipc/bearer.c | 15 +++++++++++++++
 net/tipc/bearer.h |  1 +
 net/tipc/node.c   | 15 +++++++++++----
 3 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/net/tipc/bearer.c b/net/tipc/bearer.c
index bf8f05c..a597708 100644
--- a/net/tipc/bearer.c
+++ b/net/tipc/bearer.c
@@ -330,6 +330,21 @@ static int tipc_reset_bearer(struct net *net, struct tipc_bearer *b)
 	return 0;
 }
 
+/* tipc_bearer_reset_all - reset all links on all bearers
+ */
+void tipc_bearer_reset_all(struct net *net)
+{
+	struct tipc_net *tn = tipc_net(net);
+	struct tipc_bearer *b;
+	int i;
+
+	for (i = 0; i < MAX_BEARERS; i++) {
+		b = rcu_dereference_rtnl(tn->bearer_list[i]);
+		if (b)
+			tipc_reset_bearer(net, b);
+	}
+}
+
 /**
  * bearer_disable
  *
diff --git a/net/tipc/bearer.h b/net/tipc/bearer.h
index f686e41..60e49c3 100644
--- a/net/tipc/bearer.h
+++ b/net/tipc/bearer.h
@@ -198,6 +198,7 @@ void tipc_bearer_add_dest(struct net *net, u32 bearer_id, u32 dest);
 void tipc_bearer_remove_dest(struct net *net, u32 bearer_id, u32 dest);
 struct tipc_bearer *tipc_bearer_find(struct net *net, const char *name);
 struct tipc_media *tipc_media_find(const char *name);
+void tipc_bearer_reset_all(struct net *net);
 int tipc_bearer_setup(void);
 void tipc_bearer_cleanup(void);
 void tipc_bearer_stop(struct net *net);
diff --git a/net/tipc/node.c b/net/tipc/node.c
index e01e2c71..23d4761 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -1297,10 +1297,6 @@ static void tipc_node_bc_rcv(struct net *net, struct sk_buff *skb, int bearer_id
 
 	rc = tipc_bcast_rcv(net, be->link, skb);
 
-	/* Broadcast link reset may happen at reassembly failure */
-	if (rc & TIPC_LINK_DOWN_EVT)
-		tipc_node_reset_links(n);
-
 	/* Broadcast ACKs are sent on a unicast link */
 	if (rc & TIPC_LINK_SND_BC_ACK) {
 		tipc_node_read_lock(n);
@@ -1320,6 +1316,17 @@ static void tipc_node_bc_rcv(struct net *net, struct sk_buff *skb, int bearer_id
 		spin_unlock_bh(&be->inputq2.lock);
 		tipc_sk_mcast_rcv(net, &be->arrvq, &be->inputq2);
 	}
+
+	if (rc & TIPC_LINK_DOWN_EVT) {
+		/* Reception reassembly failure => reset all links to peer */
+		if (!tipc_link_is_up(be->link))
+			tipc_node_reset_links(n);
+
+		/* Retransmission failure => reset all links to all peers */
+		if (!tipc_link_is_up(tipc_bc_sndlink(net)))
+			tipc_bearer_reset_all(net);
+	}
+
 	tipc_node_put(n);
 }
 
-- 
1.9.1


------------------------------------------------------------------------------
Attend Shape: An AT&T Tech Expo July 15-16. Meet us at AT&T Park in San
Francisco, CA to explore cutting-edge tech and listen to tech luminaries
present their vision of the future. This family event has something for
everyone, including kids. Get more information and register today.
http://sdm.link/attshape

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH net 0/3] tipc: three small fixes
  2016-07-11 20:08 [PATCH net 0/3] tipc: three small fixes Jon Maloy
                   ` (2 preceding siblings ...)
  2016-07-11 20:08 ` [PATCH net 3/3] tipc: reset all unicast links when broadcast send link fails Jon Maloy
@ 2016-07-12  5:43 ` David Miller
  3 siblings, 0 replies; 5+ messages in thread
From: David Miller @ 2016-07-12  5:43 UTC (permalink / raw)
  To: jon.maloy
  Cc: netdev, paul.gortmaker, parthasarathy.bhuvaragan, richard.alpe,
	ying.xue, maloy, tipc-discussion

From: Jon Maloy <jon.maloy@ericsson.com>
Date: Mon, 11 Jul 2016 16:08:34 -0400

> Fixes for some broadcast link problems that may occur in large systems.

Series applied, thanks Jon.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-07-12  5:43 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-07-11 20:08 [PATCH net 0/3] tipc: three small fixes Jon Maloy
2016-07-11 20:08 ` [PATCH net 1/3] tipc: extend broadcast link initialization criteria Jon Maloy
2016-07-11 20:08 ` [PATCH net 2/3] tipc: ensure correct broadcast send buffer release when peer is lost Jon Maloy
2016-07-11 20:08 ` [PATCH net 3/3] tipc: reset all unicast links when broadcast send link fails Jon Maloy
2016-07-12  5:43 ` [PATCH net 0/3] tipc: three small fixes David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).