[PATCH 0/6] [SCTP] Bug fixes

* [PATCH 0/6] [SCTP] Bug fixes
@ 2008-06-04 18:25 Vlad Yasevich
  2008-06-04 18:25 ` [PATCH 1/6] SCTP: retran_path update bug fix Vlad Yasevich
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Vlad Yasevich @ 2008-06-04 18:25 UTC
  To: netdev; +Cc: linux-sctp, davem


The following is a series of patches that correct the following bugs:
  1.  Correctly pick a retransmission path when we have more than one
      path, but only 1 is active.

  2.  Fix multiple problems with fast retransmit.  These include: not
      restarting retransmission timers, incorrectly decrementing the
      congestion window, and sending multiple packets during fast
      retransmit.

  3.  Fix ECN markings when sending over IPv6.  SCTP by default marks
      that it's ECN capable.  I found that this behavior changed not
      too long ago.

There was also a small cleanup.

Please apply.

Thanks
-vlad


* [PATCH 1/6] SCTP: retran_path update bug fix
  2008-06-04 18:25 [PATCH 0/6] [SCTP] Bug fixes Vlad Yasevich
@ 2008-06-04 18:25 ` Vlad Yasevich
  2008-06-04 19:37   ` David Miller
  2008-06-04 18:25 ` [PATCH 2/6] SCTP: Move sctp_v4_dst_saddr out of loop Vlad Yasevich
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Vlad Yasevich @ 2008-06-04 18:25 UTC
  To: netdev; +Cc: linux-sctp, davem, Gui Jianfeng, Vlad Yasevich

From: Gui Jianfeng <guijianfeng@cn.fujitsu.com>

If the current retran_path is the only active one, it should
be updated to the next inactive one.

Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
---
 net/sctp/associola.c |   21 ++++++++++++---------
 1 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/net/sctp/associola.c b/net/sctp/associola.c
index b4cd2b7..5326348 100644
--- a/net/sctp/associola.c
+++ b/net/sctp/associola.c
@@ -1203,6 +1203,9 @@ void sctp_assoc_update_retran_path(struct sctp_association *asoc)
 	struct list_head *head = &asoc->peer.transport_addr_list;
 	struct list_head *pos;
 
+	if (asoc->peer.transport_count == 1)
+		return;
+
 	/* Find the next transport in a round-robin fashion. */
 	t = asoc->peer.retran_path;
 	pos = &t->transports;
@@ -1217,6 +1220,15 @@ void sctp_assoc_update_retran_path(struct sctp_association *asoc)
 
 		t = list_entry(pos, struct sctp_transport, transports);
 
+		/* We have exhausted the list, but didn't find any
+		 * other active transports.  If so, use the next
+		 * transport.
+		 */
+		if (t == asoc->peer.retran_path) {
+			t = next;
+			break;
+		}
+
 		/* Try to find an active transport. */
 
 		if ((t->state == SCTP_ACTIVE) ||
@@ -1229,15 +1241,6 @@ void sctp_assoc_update_retran_path(struct sctp_association *asoc)
 			if (!next)
 				next = t;
 		}
-
-		/* We have exhausted the list, but didn't find any
-		 * other active transports.  If so, use the next
-		 * transport.
-		 */
-		if (t == asoc->peer.retran_path) {
-			t = next;
-			break;
-		}
 	}
 
 	asoc->peer.retran_path = t;
-- 
1.5.3.5
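
The selection rule described in this patch can be sketched outside the kernel as follows. This is a minimal illustration, assuming an array of paths instead of the kernel's linked list; `path_state` and `next_retran_path` are hypothetical names, not part of the SCTP code.

```c
#include <stddef.h>

/* Hypothetical, simplified path states for illustration only. */
enum path_state { PATH_INACTIVE, PATH_ACTIVE };

/*
 * Pick the next retransmission path in round-robin order, starting
 * after 'current'.  Prefer the first active path; if no other path
 * is active, fall back to the first path seen after 'current'.
 * Returns the index of the chosen path.
 */
size_t next_retran_path(const enum path_state *state, size_t count,
			size_t current)
{
	size_t fallback = current;
	int have_fallback = 0;
	size_t step;

	/* With a single path there is nothing to switch to. */
	if (count <= 1)
		return current;

	for (step = 1; step < count; step++) {
		size_t i = (current + step) % count;

		if (state[i] == PATH_ACTIVE)
			return i;

		/* Remember the first inactive candidate. */
		if (!have_fallback) {
			fallback = i;
			have_fallback = 1;
		}
	}

	/* No other active path: use the next one anyway. */
	return fallback;
}
```

With three paths where only path 0 is active and it is also the current retran_path, the sketch moves to path 1 rather than staying put, which is the behavior the patch is after.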



* [PATCH 2/6] SCTP: Move sctp_v4_dst_saddr out of loop
  2008-06-04 18:25 [PATCH 0/6] [SCTP] Bug fixes Vlad Yasevich
  2008-06-04 18:25 ` [PATCH 1/6] SCTP: retran_path update bug fix Vlad Yasevich
@ 2008-06-04 18:25 ` Vlad Yasevich
  2008-06-04 19:38   ` David Miller
  2008-06-04 18:25 ` [PATCH 3/6] SCTP: Correctly implement Fast Recovery cwnd manipulations Vlad Yasevich
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Vlad Yasevich @ 2008-06-04 18:25 UTC
  To: netdev; +Cc: linux-sctp, davem, Gui Jianfeng, Vlad Yasevich

From: Gui Jianfeng <guijianfeng@cn.fujitsu.com>

There's no need to execute sctp_v4_dst_saddr() on each
iteration; just move it out of the loop.

Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
---
 net/sctp/protocol.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index 0ec234b..040e489 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -470,11 +470,11 @@ static struct dst_entry *sctp_v4_get_dst(struct sctp_association *asoc,
 		/* Walk through the bind address list and look for a bind
 		 * address that matches the source address of the returned dst.
 		 */
+		sctp_v4_dst_saddr(&dst_saddr, dst, htons(bp->port));
 		rcu_read_lock();
 		list_for_each_entry_rcu(laddr, &bp->address_list, list) {
 			if (!laddr->valid || (laddr->state != SCTP_ADDR_SRC))
 				continue;
-			sctp_v4_dst_saddr(&dst_saddr, dst, htons(bp->port));
 			if (sctp_v4_cmp_addr(&dst_saddr, &laddr->a))
 				goto out_unlock;
 		}
-- 
1.5.3.5



* [PATCH 3/6] SCTP: Correctly implement Fast Recovery cwnd manipulations.
  2008-06-04 18:25 [PATCH 0/6] [SCTP] Bug fixes Vlad Yasevich
  2008-06-04 18:25 ` [PATCH 1/6] SCTP: retran_path update bug fix Vlad Yasevich
  2008-06-04 18:25 ` [PATCH 2/6] SCTP: Move sctp_v4_dst_saddr out of loop Vlad Yasevich
@ 2008-06-04 18:25 ` Vlad Yasevich
  2008-06-04 19:38   ` David Miller
  2008-06-04 18:25 ` [PATCH 4/6] SCTP: Start T3-RTX timer when fast retransmitting lowest TSN Vlad Yasevich
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Vlad Yasevich @ 2008-06-04 18:25 UTC
  To: netdev; +Cc: linux-sctp, davem, Vlad Yasevich

Correctly keep track of Fast Recovery state and do not reduce
the congestion window multiple times during such a state.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Tested-by: Wei Yongjun <yjwei@cn.fujitsu.com>
---
 include/net/sctp/structs.h |    8 +++++++-
 net/sctp/transport.c       |   44 ++++++++++++++++++++++++++++++++------------
 2 files changed, 39 insertions(+), 13 deletions(-)

diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 0ce0443..4f7b587 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -901,7 +901,10 @@ struct sctp_transport {
 	 *		calculation completes (i.e. the DATA chunk
 	 *		is SACK'd) clear this flag.
 	 */
-	int rto_pending;
+	__u8 rto_pending;
+
+	/* Flag to track the current fast recovery state */
+	__u8 fast_recovery;
 
 	/*
 	 * These are the congestion stats.
@@ -920,6 +923,9 @@ struct sctp_transport {
 	/* Data that has been sent, but not acknowledged. */
 	__u32 flight_size;
 
+	/* TSN marking the fast recovery exit point */
+	__u32 fast_recovery_exit;
+
 	/* Destination */
 	struct dst_entry *dst;
 	/* Source address. */
diff --git a/net/sctp/transport.c b/net/sctp/transport.c
index f4938f6..3759027 100644
--- a/net/sctp/transport.c
+++ b/net/sctp/transport.c
@@ -79,6 +79,7 @@ static struct sctp_transport *sctp_transport_init(struct sctp_transport *peer,
 	peer->rttvar = 0;
 	peer->srtt = 0;
 	peer->rto_pending = 0;
+	peer->fast_recovery = 0;
 
 	peer->last_time_heard = jiffies;
 	peer->last_time_used = jiffies;
@@ -403,11 +404,16 @@ void sctp_transport_raise_cwnd(struct sctp_transport *transport,
 	cwnd = transport->cwnd;
 	flight_size = transport->flight_size;
 
+	/* See if we need to exit Fast Recovery first */
+	if (transport->fast_recovery &&
+	    TSN_lte(transport->fast_recovery_exit, sack_ctsn))
+		transport->fast_recovery = 0;
+
 	/* The appropriate cwnd increase algorithm is performed if, and only
-	 * if the cumulative TSN has advanced and the congestion window is
+	 * if the cumulative TSN would advance and the congestion window is
 	 * being fully utilized.
 	 */
-	if ((transport->asoc->ctsn_ack_point >= sack_ctsn) ||
+	if (TSN_lte(sack_ctsn, transport->asoc->ctsn_ack_point) ||
 	    (flight_size < cwnd))
 		return;
 
@@ -416,17 +422,23 @@ void sctp_transport_raise_cwnd(struct sctp_transport *transport,
 	pmtu = transport->asoc->pathmtu;
 
 	if (cwnd <= ssthresh) {
-		/* RFC 2960 7.2.1, sctpimpguide-05 2.14.2 When cwnd is less
-		 * than or equal to ssthresh an SCTP endpoint MUST use the
-		 * slow start algorithm to increase cwnd only if the current
-		 * congestion window is being fully utilized and an incoming
-		 * SACK advances the Cumulative TSN Ack Point. Only when these
-		 * two conditions are met can the cwnd be increased otherwise
-		 * the cwnd MUST not be increased. If these conditions are met
-		 * then cwnd MUST be increased by at most the lesser of
-		 * 1) the total size of the previously outstanding DATA
-		 * chunk(s) acknowledged, and 2) the destination's path MTU.
+		/* RFC 4960 7.2.1
+		 * o  When cwnd is less than or equal to ssthresh, an SCTP
+		 *    endpoint MUST use the slow-start algorithm to increase
+		 *    cwnd only if the current congestion window is being fully
+		 *    utilized, an incoming SACK advances the Cumulative TSN
+		 *    Ack Point, and the data sender is not in Fast Recovery.
+		 *    Only when these three conditions are met can the cwnd be
+		 *    increased; otherwise, the cwnd MUST not be increased.
+		 *    If these conditions are met, then cwnd MUST be increased
+		 *    by, at most, the lesser of 1) the total size of the
+		 *    previously outstanding DATA chunk(s) acknowledged, and
+		 *    2) the destination's path MTU.  This upper bound protects
+		 *    against the ACK-Splitting attack outlined in [SAVAGE99].
 		 */
+		if (transport->fast_recovery)
+			return;
+
 		if (bytes_acked > pmtu)
 			cwnd += pmtu;
 		else
@@ -502,6 +514,13 @@ void sctp_transport_lower_cwnd(struct sctp_transport *transport,
 		 *      cwnd = ssthresh
 		 *      partial_bytes_acked = 0
 		 */
+		if (transport->fast_recovery)
+			return;
+
+		/* Mark Fast recovery */
+		transport->fast_recovery = 1;
+		transport->fast_recovery_exit = transport->asoc->next_tsn - 1;
+
 		transport->ssthresh = max(transport->cwnd/2,
 					  4*transport->asoc->pathmtu);
 		transport->cwnd = transport->ssthresh;
@@ -586,6 +605,7 @@ void sctp_transport_reset(struct sctp_transport *t)
 	t->flight_size = 0;
 	t->error_count = 0;
 	t->rto_pending = 0;
+	t->fast_recovery = 0;
 
 	/* Initialize the state information for SFR-CACC */
 	t->cacc.changeover_active = 0;
-- 
1.5.3.5
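
The Fast Recovery bookkeeping in this patch can be illustrated with a small standalone sketch. It assumes a simplified `struct path` and hypothetical helper names (`tsn_lte`, `enter_fast_recovery`, `on_sack`); only the cwnd/ssthresh arithmetic and the exit-point rule are taken from the diff above.

```c
#include <stdint.h>

/* Serial-number comparison: true if TSN a is at or before TSN b,
 * even when the 32-bit TSN space has wrapped around.
 */
int tsn_lte(uint32_t a, uint32_t b)
{
	return a == b || ((a - b) & 0x80000000u) != 0;
}

/* Hypothetical, simplified per-path congestion state. */
struct path {
	uint32_t cwnd;
	uint32_t ssthresh;
	uint32_t pmtu;
	uint32_t fast_recovery_exit;	/* TSN that ends Fast Recovery */
	uint8_t fast_recovery;		/* 1 while in Fast Recovery */
};

/* Called when a fast retransmit is triggered.  The window is cut
 * only on entry; further triggers during recovery are ignored.
 */
void enter_fast_recovery(struct path *p, uint32_t next_tsn)
{
	uint32_t half = p->cwnd / 2;
	uint32_t min_ssthresh = 4 * p->pmtu;

	if (p->fast_recovery)
		return;

	p->fast_recovery = 1;
	p->fast_recovery_exit = next_tsn - 1;	/* highest TSN sent so far */

	p->ssthresh = half > min_ssthresh ? half : min_ssthresh;
	p->cwnd = p->ssthresh;
}

/* Called for each SACK: leave Fast Recovery once the cumulative
 * TSN ack reaches the recorded exit point.
 */
void on_sack(struct path *p, uint32_t sack_ctsn)
{
	if (p->fast_recovery && tsn_lte(p->fast_recovery_exit, sack_ctsn))
		p->fast_recovery = 0;
}
```

Starting from cwnd = 20000 and a path MTU of 1500, the first call halves the window to 10000; a second call before the exit TSN is acknowledged leaves it unchanged.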



* [PATCH 4/6] SCTP: Start T3-RTX timer when fast retransmitting lowest TSN
  2008-06-04 18:25 [PATCH 0/6] [SCTP] Bug fixes Vlad Yasevich
                   ` (2 preceding siblings ...)
  2008-06-04 18:25 ` [PATCH 3/6] SCTP: Correctly implement Fast Recovery cwnd manipulations Vlad Yasevich
@ 2008-06-04 18:25 ` Vlad Yasevich
  2008-06-04 19:39   ` David Miller
  2008-06-04 18:25 ` [PATCH 5/6] SCTP: Flush the queue only once during fast retransmit Vlad Yasevich
  2008-06-04 18:25 ` [PATCH 6/6] [SCTP]: Fix ECN markings for IPv6 Vlad Yasevich
  5 siblings, 1 reply; 13+ messages in thread
From: Vlad Yasevich @ 2008-06-04 18:25 UTC
  To: netdev; +Cc: linux-sctp, davem, Vlad Yasevich

When we are trying to fast retransmit the lowest outstanding TSN, we
need to restart the T3-RTX timer, so that subsequent timeouts will
correctly tag all the packets necessary for retransmissions.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Tested-by: Wei Yongjun <yjwei@cn.fujitsu.com>
---
 include/net/sctp/structs.h |    5 ++++-
 net/sctp/outqueue.c        |   42 +++++++++++++++++++++++++++++++-----------
 net/sctp/transport.c       |    4 ++--
 3 files changed, 37 insertions(+), 14 deletions(-)

diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 4f7b587..1014c77 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -1050,7 +1050,7 @@ void sctp_transport_route(struct sctp_transport *, union sctp_addr *,
 			  struct sctp_sock *);
 void sctp_transport_pmtu(struct sctp_transport *);
 void sctp_transport_free(struct sctp_transport *);
-void sctp_transport_reset_timers(struct sctp_transport *);
+void sctp_transport_reset_timers(struct sctp_transport *, int);
 void sctp_transport_hold(struct sctp_transport *);
 void sctp_transport_put(struct sctp_transport *);
 void sctp_transport_update_rto(struct sctp_transport *, __u32);
@@ -1140,6 +1140,9 @@ struct sctp_outq {
 	/* How many unackd bytes do we have in-flight?	*/
 	__u32 outstanding_bytes;
 
+	/* Are we doing fast-rtx on this queue */
+	char fast_rtx;
+
 	/* Corked? */
 	char cork;
 
diff --git a/net/sctp/outqueue.c b/net/sctp/outqueue.c
index 59edfd2..5d3c441 100644
--- a/net/sctp/outqueue.c
+++ b/net/sctp/outqueue.c
@@ -208,6 +208,7 @@ void sctp_outq_init(struct sctp_association *asoc, struct sctp_outq *q)
 	INIT_LIST_HEAD(&q->sacked);
 	INIT_LIST_HEAD(&q->abandoned);
 
+	q->fast_rtx = 0;
 	q->outstanding_bytes = 0;
 	q->empty = 1;
 	q->cork  = 0;
@@ -500,6 +501,7 @@ void sctp_retransmit(struct sctp_outq *q, struct sctp_transport *transport,
 	case SCTP_RTXR_FAST_RTX:
 		SCTP_INC_STATS(SCTP_MIB_FAST_RETRANSMITS);
 		sctp_transport_lower_cwnd(transport, SCTP_LOWER_CWND_FAST_RTX);
+		q->fast_rtx = 1;
 		break;
 	case SCTP_RTXR_PMTUD:
 		SCTP_INC_STATS(SCTP_MIB_PMTUD_RETRANSMITS);
@@ -543,10 +545,13 @@ static int sctp_outq_flush_rtx(struct sctp_outq *q, struct sctp_packet *pkt,
 	sctp_xmit_t status;
 	struct sctp_chunk *chunk, *chunk1;
 	struct sctp_association *asoc;
+	int fast_rtx;
 	int error = 0;
+	int timer = 0;
 
 	asoc = q->asoc;
 	lqueue = &q->retransmit;
+	fast_rtx = q->fast_rtx;
 
 	/* RFC 2960 6.3.3 Handle T3-rtx Expiration
 	 *
@@ -587,13 +592,12 @@ static int sctp_outq_flush_rtx(struct sctp_outq *q, struct sctp_packet *pkt,
 		switch (status) {
 		case SCTP_XMIT_PMTU_FULL:
 			/* Send this packet.  */
-			if ((error = sctp_packet_transmit(pkt)) == 0)
-				*start_timer = 1;
+			error = sctp_packet_transmit(pkt);
 
 			/* If we are retransmitting, we should only
 			 * send a single packet.
 			 */
-			if (rtx_timeout) {
+			if (rtx_timeout || fast_rtx) {
 				list_add(lchunk, lqueue);
 				lchunk = NULL;
 			}
@@ -603,8 +607,7 @@ static int sctp_outq_flush_rtx(struct sctp_outq *q, struct sctp_packet *pkt,
 
 		case SCTP_XMIT_RWND_FULL:
 			/* Send this packet. */
-			if ((error = sctp_packet_transmit(pkt)) == 0)
-				*start_timer = 1;
+			error = sctp_packet_transmit(pkt);
 
 			/* Stop sending DATA as there is no more room
 			 * at the receiver.
@@ -615,8 +618,7 @@ static int sctp_outq_flush_rtx(struct sctp_outq *q, struct sctp_packet *pkt,
 
 		case SCTP_XMIT_NAGLE_DELAY:
 			/* Send this packet. */
-			if ((error = sctp_packet_transmit(pkt)) == 0)
-				*start_timer = 1;
+			error = sctp_packet_transmit(pkt);
 
 			/* Stop sending DATA because of nagle delay. */
 			list_add(lchunk, lqueue);
@@ -635,7 +637,14 @@ static int sctp_outq_flush_rtx(struct sctp_outq *q, struct sctp_packet *pkt,
 			if (chunk->fast_retransmit > 0)
 				chunk->fast_retransmit = -1;
 
-			*start_timer = 1;
+			/* Force start T3-rtx timer when fast retransmitting
+			 * the earliest outstanding TSN
+			 */
+			if (!timer && fast_rtx &&
+			    ntohl(chunk->subh.data_hdr->tsn) ==
+					     asoc->ctsn_ack_point + 1)
+				timer = 2;
+
 			q->empty = 0;
 
 			/* Retrieve a new chunk to bundle. */
@@ -643,12 +652,16 @@ static int sctp_outq_flush_rtx(struct sctp_outq *q, struct sctp_packet *pkt,
 			break;
 		}
 
+		/* Set the timer if there were no errors */
+		if (!error && !timer)
+			timer = 1;
+
 		/* If we are here due to a retransmit timeout or a fast
 		 * retransmit and if there are any chunks left in the retransmit
 		 * queue that could not fit in the PMTU sized packet, they need
 		 * to be marked as ineligible for a subsequent fast retransmit.
 		 */
-		if (rtx_timeout && !lchunk) {
+		if (rtx_timeout && fast_rtx) {
 			list_for_each_entry(chunk1, lqueue, transmitted_list) {
 				if (chunk1->fast_retransmit > 0)
 					chunk1->fast_retransmit = -1;
@@ -656,6 +669,12 @@ static int sctp_outq_flush_rtx(struct sctp_outq *q, struct sctp_packet *pkt,
 		}
 	}
 
+	*start_timer = timer;
+
+	/* Clear fast retransmit hint */
+	if (fast_rtx)
+		q->fast_rtx = 0;
+
 	return error;
 }
 
@@ -862,7 +881,8 @@ int sctp_outq_flush(struct sctp_outq *q, int rtx_timeout)
 						    rtx_timeout, &start_timer);
 
 			if (start_timer)
-				sctp_transport_reset_timers(transport);
+				sctp_transport_reset_timers(transport,
+							    start_timer-1);
 
 			/* This can happen on COOKIE-ECHO resend.  Only
 			 * one chunk can get bundled with a COOKIE-ECHO.
@@ -977,7 +997,7 @@ int sctp_outq_flush(struct sctp_outq *q, int rtx_timeout)
 			list_add_tail(&chunk->transmitted_list,
 				      &transport->transmitted);
 
-			sctp_transport_reset_timers(transport);
+			sctp_transport_reset_timers(transport, start_timer-1);
 
 			q->empty = 0;
 
diff --git a/net/sctp/transport.c b/net/sctp/transport.c
index 3759027..33d9201 100644
--- a/net/sctp/transport.c
+++ b/net/sctp/transport.c
@@ -191,7 +191,7 @@ static void sctp_transport_destroy(struct sctp_transport *transport)
 /* Start T3_rtx timer if it is not already running and update the heartbeat
  * timer.  This routine is called every time a DATA chunk is sent.
  */
-void sctp_transport_reset_timers(struct sctp_transport *transport)
+void sctp_transport_reset_timers(struct sctp_transport *transport, int force)
 {
 	/* RFC 2960 6.3.2 Retransmission Timer Rules
 	 *
@@ -201,7 +201,7 @@ void sctp_transport_reset_timers(struct sctp_transport *transport)
 	 * address.
 	 */
 
-	if (!timer_pending(&transport->T3_rtx_timer))
+	if (force || !timer_pending(&transport->T3_rtx_timer))
 		if (!mod_timer(&transport->T3_rtx_timer,
 			       jiffies + transport->rto))
 			sctp_transport_hold(transport);
-- 
1.5.3.5
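
The patch encodes three timer actions in one integer: 0 leaves the timer alone, 1 starts it only if it is idle, and 2 forces a restart. The sketch below models that encoding with a toy timer; `struct rtx_timer`, `reset_timer`, and `apply_start_timer` are hypothetical stand-ins for the kernel timer and sctp_transport_reset_timers(), not the real API.

```c
/* Toy model of a retransmission timer. */
struct rtx_timer {
	int pending;		/* non-zero while the timer is armed */
	unsigned long expires;	/* absolute expiry time */
};

/* Arm the timer if it is idle, or unconditionally when 'force' is set. */
void reset_timer(struct rtx_timer *t, unsigned long now,
		 unsigned long rto, int force)
{
	if (force || !t->pending) {
		t->pending = 1;
		t->expires = now + rto;
	}
}

/* Interpret the tri-state value produced by the retransmit loop:
 *   0 - nothing was sent, leave the timer untouched
 *   1 - start the timer only if it is not already running
 *   2 - lowest outstanding TSN was fast retransmitted, force a restart
 */
void apply_start_timer(struct rtx_timer *t, unsigned long now,
		       unsigned long rto, int start_timer)
{
	if (start_timer)
		reset_timer(t, now, rto, start_timer - 1);
}
```

Passing `start_timer - 1` as the force flag maps 1 to "do not force" and 2 to "force", which is why the caller in the sketch guards against 0 first.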



* [PATCH 5/6] SCTP: Flush the queue only once during fast retransmit.
  2008-06-04 18:25 [PATCH 0/6] [SCTP] Bug fixes Vlad Yasevich
                   ` (3 preceding siblings ...)
  2008-06-04 18:25 ` [PATCH 4/6] SCTP: Start T3-RTX timer when fast retransmitting lowest TSN Vlad Yasevich
@ 2008-06-04 18:25 ` Vlad Yasevich
  2008-06-04 19:39   ` David Miller
  2008-06-04 18:25 ` [PATCH 6/6] [SCTP]: Fix ECN markings for IPv6 Vlad Yasevich
  5 siblings, 1 reply; 13+ messages in thread
From: Vlad Yasevich @ 2008-06-04 18:25 UTC
  To: netdev; +Cc: linux-sctp, davem, Vlad Yasevich

When fast retransmit is triggered by a sack, we should flush
the queue only once so that only 1 retransmit happens.  Also,
since we could potentially have non-fast-rtx chunks on
the retransmit queue, we need to make sure any chunks eligible
for fast retransmit are sent first during fast retransmission.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Tested-by: Wei Yongjun <yjwei@cn.fujitsu.com>
---
 net/sctp/outqueue.c |   82 ++++++++++++++++++++++++++++++---------------------
 1 files changed, 48 insertions(+), 34 deletions(-)

diff --git a/net/sctp/outqueue.c b/net/sctp/outqueue.c
index 5d3c441..ace6770 100644
--- a/net/sctp/outqueue.c
+++ b/net/sctp/outqueue.c
@@ -520,9 +520,15 @@ void sctp_retransmit(struct sctp_outq *q, struct sctp_transport *transport,
 	 * the sender SHOULD try to advance the "Advanced.Peer.Ack.Point" by
 	 * following the procedures outlined in C1 - C5.
 	 */
-	sctp_generate_fwdtsn(q, q->asoc->ctsn_ack_point);
+	if (reason == SCTP_RTXR_T3_RTX)
+		sctp_generate_fwdtsn(q, q->asoc->ctsn_ack_point);
 
-	error = sctp_outq_flush(q, /* rtx_timeout */ 1);
+	/* Flush the queues only on timeout, since fast_rtx is only
+	 * triggered during sack processing and the queue
+	 * will be flushed at the end.
+	 */
+	if (reason != SCTP_RTXR_FAST_RTX)
+		error = sctp_outq_flush(q, /* rtx_timeout */ 1);
 
 	if (error)
 		q->asoc->base.sk->sk_err = -error;
@@ -540,7 +546,6 @@ static int sctp_outq_flush_rtx(struct sctp_outq *q, struct sctp_packet *pkt,
 			       int rtx_timeout, int *start_timer)
 {
 	struct list_head *lqueue;
-	struct list_head *lchunk;
 	struct sctp_transport *transport = pkt->transport;
 	sctp_xmit_t status;
 	struct sctp_chunk *chunk, *chunk1;
@@ -548,12 +553,16 @@ static int sctp_outq_flush_rtx(struct sctp_outq *q, struct sctp_packet *pkt,
 	int fast_rtx;
 	int error = 0;
 	int timer = 0;
+	int done = 0;
 
 	asoc = q->asoc;
 	lqueue = &q->retransmit;
 	fast_rtx = q->fast_rtx;
 
-	/* RFC 2960 6.3.3 Handle T3-rtx Expiration
+	/* This loop handles time-out retransmissions, fast retransmissions,
+	 * and retransmissions due to opening of window.
+	 *
+	 * RFC 2960 6.3.3 Handle T3-rtx Expiration
 	 *
 	 * E3) Determine how many of the earliest (i.e., lowest TSN)
 	 * outstanding DATA chunks for the address for which the
@@ -568,12 +577,12 @@ static int sctp_outq_flush_rtx(struct sctp_outq *q, struct sctp_packet *pkt,
 	 * [Just to be painfully clear, if we are retransmitting
 	 * because a timeout just happened, we should send only ONE
 	 * packet of retransmitted data.]
+	 *
+	 * For fast retransmissions we also send only ONE packet.  However,
+	 * if we are just flushing the queue due to open window, we'll
+	 * try to send as much as possible.
 	 */
-	lchunk = sctp_list_dequeue(lqueue);
-
-	while (lchunk) {
-		chunk = list_entry(lchunk, struct sctp_chunk,
-				   transmitted_list);
+	list_for_each_entry_safe(chunk, chunk1, lqueue, transmitted_list) {
 
 		/* Make sure that Gap Acked TSNs are not retransmitted.  A
 		 * simple approach is just to move such TSNs out of the
@@ -581,11 +590,18 @@ static int sctp_outq_flush_rtx(struct sctp_outq *q, struct sctp_packet *pkt,
 		 * next chunk.
 		 */
 		if (chunk->tsn_gap_acked) {
-			list_add_tail(lchunk, &transport->transmitted);
-			lchunk = sctp_list_dequeue(lqueue);
+			list_del(&chunk->transmitted_list);
+			list_add_tail(&chunk->transmitted_list,
+					&transport->transmitted);
 			continue;
 		}
 
+		/* If we are doing fast retransmit, ignore non-fast_retransmit
+		 * chunks
+		 */
+		if (fast_rtx && !chunk->fast_retransmit)
+			continue;
+
 		/* Attempt to append this chunk to the packet. */
 		status = sctp_packet_append_chunk(pkt, chunk);
 
@@ -597,12 +613,10 @@ static int sctp_outq_flush_rtx(struct sctp_outq *q, struct sctp_packet *pkt,
 			/* If we are retransmitting, we should only
 			 * send a single packet.
 			 */
-			if (rtx_timeout || fast_rtx) {
-				list_add(lchunk, lqueue);
-				lchunk = NULL;
-			}
+			if (rtx_timeout || fast_rtx)
+				done = 1;
 
-			/* Bundle lchunk in the next round.  */
+			/* Bundle next chunk in the next round.  */
 			break;
 
 		case SCTP_XMIT_RWND_FULL:
@@ -612,8 +626,7 @@ static int sctp_outq_flush_rtx(struct sctp_outq *q, struct sctp_packet *pkt,
 			/* Stop sending DATA as there is no more room
 			 * at the receiver.
 			 */
-			list_add(lchunk, lqueue);
-			lchunk = NULL;
+			done = 1;
 			break;
 
 		case SCTP_XMIT_NAGLE_DELAY:
@@ -621,15 +634,16 @@ static int sctp_outq_flush_rtx(struct sctp_outq *q, struct sctp_packet *pkt,
 			error = sctp_packet_transmit(pkt);
 
 			/* Stop sending DATA because of nagle delay. */
-			list_add(lchunk, lqueue);
-			lchunk = NULL;
+			done = 1;
 			break;
 
 		default:
 			/* The append was successful, so add this chunk to
 			 * the transmitted list.
 			 */
-			list_add_tail(lchunk, &transport->transmitted);
+			list_del(&chunk->transmitted_list);
+			list_add_tail(&chunk->transmitted_list,
+					&transport->transmitted);
 
 			/* Mark the chunk as ineligible for fast retransmit
 			 * after it is retransmitted.
@@ -646,9 +660,6 @@ static int sctp_outq_flush_rtx(struct sctp_outq *q, struct sctp_packet *pkt,
 				timer = 2;
 
 			q->empty = 0;
-
-			/* Retrieve a new chunk to bundle. */
-			lchunk = sctp_list_dequeue(lqueue);
 			break;
 		}
 
@@ -656,16 +667,19 @@ static int sctp_outq_flush_rtx(struct sctp_outq *q, struct sctp_packet *pkt,
 		if (!error && !timer)
 			timer = 1;
 
-		/* If we are here due to a retransmit timeout or a fast
-		 * retransmit and if there are any chunks left in the retransmit
-		 * queue that could not fit in the PMTU sized packet, they need
-		 * to be marked as ineligible for a subsequent fast retransmit.
-		 */
-		if (rtx_timeout && fast_rtx) {
-			list_for_each_entry(chunk1, lqueue, transmitted_list) {
-				if (chunk1->fast_retransmit > 0)
-					chunk1->fast_retransmit = -1;
-			}
+		if (done)
+			break;
+	}
+
+	/* If we are here due to a retransmit timeout or a fast
+	 * retransmit and if there are any chunks left in the retransmit
+	 * queue that could not fit in the PMTU sized packet, they need
+	 * to be marked as ineligible for a subsequent fast retransmit.
+	 */
+	if (rtx_timeout || fast_rtx) {
+		list_for_each_entry(chunk1, lqueue, transmitted_list) {
+			if (chunk1->fast_retransmit > 0)
+				chunk1->fast_retransmit = -1;
 		}
 	}
 
-- 
1.5.3.5
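
The "one packet, eligible chunks first" rule can be sketched with a flat array in place of the kernel's retransmit list. This is an illustration under stated assumptions: `struct chunk` and `fill_one_fast_rtx_packet` are hypothetical, and eligibility is simplified to `fast_retransmit > 0`, whereas the patch itself tests `!chunk->fast_retransmit`.

```c
#include <stddef.h>

/* Hypothetical, simplified view of a queued DATA chunk. */
struct chunk {
	unsigned int size;	/* bytes the chunk occupies in a packet */
	int fast_retransmit;	/* >0 eligible, 0 not marked, -1 ineligible */
	int gap_acked;		/* already acknowledged by a gap ack block */
	int sent;		/* set once placed into the packet */
};

/*
 * Build a single fast-retransmit packet of at most 'pmtu' bytes.
 * Gap-acked chunks and chunks not marked for fast retransmit are
 * skipped.  Returns the number of chunks placed into the packet.
 */
size_t fill_one_fast_rtx_packet(struct chunk *q, size_t n, unsigned int pmtu)
{
	unsigned int used = 0;
	size_t sent = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (q[i].gap_acked || q[i].fast_retransmit <= 0)
			continue;

		/* Packet is full: fast retransmit sends only one packet. */
		if (used + q[i].size > pmtu)
			break;

		used += q[i].size;
		q[i].sent = 1;
		q[i].fast_retransmit = -1;	/* do not fast-rtx it again */
		sent++;
	}

	/* Anything eligible that did not fit must not trigger another
	 * fast retransmit later.
	 */
	for (i = 0; i < n; i++) {
		if (!q[i].sent && q[i].fast_retransmit > 0)
			q[i].fast_retransmit = -1;
	}

	return sent;
}
```

Chunks left over after the packet fills stay queued, but they are marked ineligible, so they are recovered by the normal window-open or timeout paths instead of a second fast retransmit.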



* [PATCH 6/6] [SCTP]: Fix ECN markings for IPv6
  2008-06-04 18:25 [PATCH 0/6] [SCTP] Bug fixes Vlad Yasevich
                   ` (4 preceding siblings ...)
  2008-06-04 18:25 ` [PATCH 5/6] SCTP: Flush the queue only once during fast retransmit Vlad Yasevich
@ 2008-06-04 18:25 ` Vlad Yasevich
  2008-06-04 19:40   ` David Miller
  5 siblings, 1 reply; 13+ messages in thread
From: Vlad Yasevich @ 2008-06-04 18:25 UTC
  To: netdev; +Cc: linux-sctp, davem, Vlad Yasevich

Commit e9df2e8fd8fbc95c57dbd1d33dada66c4627b44c
([IPV6]: Use appropriate sock tclass setting for routing lookup.) also
changed the way that ECN capable transports mark this capability in IPv6.
As a result, SCTP was not marking ECN capability because the traffic class
was never set.  This patch brings back the markings for IPv6 traffic.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
---
 include/net/sctp/structs.h |    1 +
 net/sctp/ipv6.c            |    6 ++++++
 net/sctp/output.c          |    2 +-
 net/sctp/protocol.c        |    6 ++++++
 4 files changed, 14 insertions(+), 1 deletions(-)

diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 1014c77..2fb05b8 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -587,6 +587,7 @@ struct sctp_af {
 	int		(*is_ce)	(const struct sk_buff *sk);
 	void		(*seq_dump_addr)(struct seq_file *seq,
 					 union sctp_addr *addr);
+	void		(*ecn_capable)(struct sock *sk);
 	__u16		net_header_len;
 	int		sockaddr_len;
 	sa_family_t	sa_family;
diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c
index e45e44c..6244367 100644
--- a/net/sctp/ipv6.c
+++ b/net/sctp/ipv6.c
@@ -726,6 +726,11 @@ static void sctp_v6_seq_dump_addr(struct seq_file *seq, union sctp_addr *addr)
 	seq_printf(seq, NIP6_FMT " ", NIP6(addr->v6.sin6_addr));
 }
 
+static void sctp_v6_ecn_capable(struct sock *sk)
+{
+	inet6_sk(sk)->tclass |= INET_ECN_ECT_0;
+}
+
 /* Initialize a PF_INET6 socket msg_name. */
 static void sctp_inet6_msgname(char *msgname, int *addr_len)
 {
@@ -996,6 +1001,7 @@ static struct sctp_af sctp_af_inet6 = {
 	.skb_iif	   = sctp_v6_skb_iif,
 	.is_ce		   = sctp_v6_is_ce,
 	.seq_dump_addr	   = sctp_v6_seq_dump_addr,
+	.ecn_capable	   = sctp_v6_ecn_capable,
 	.net_header_len	   = sizeof(struct ipv6hdr),
 	.sockaddr_len	   = sizeof(struct sockaddr_in6),
 #ifdef CONFIG_COMPAT
diff --git a/net/sctp/output.c b/net/sctp/output.c
index cf4f9fb..6d45bae 100644
--- a/net/sctp/output.c
+++ b/net/sctp/output.c
@@ -548,7 +548,7 @@ int sctp_packet_transmit(struct sctp_packet *packet)
 	 * Note: The works for IPv6 layer checks this bit too later
 	 * in transmission.  See IP6_ECN_flow_xmit().
 	 */
-	INET_ECN_xmit(nskb->sk);
+	(*tp->af_specific->ecn_capable)(nskb->sk);
 
 	/* Set up the IP options.  */
 	/* BUG: not implemented
diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index 040e489..481baf1 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -616,6 +616,11 @@ static void sctp_v4_seq_dump_addr(struct seq_file *seq, union sctp_addr *addr)
 	seq_printf(seq, "%d.%d.%d.%d ", NIPQUAD(addr->v4.sin_addr));
 }
 
+static void sctp_v4_ecn_capable(struct sock *sk)
+{
+	INET_ECN_xmit(sk);
+}
+
 /* Event handler for inet address addition/deletion events.
  * The sctp_local_addr_list needs to be protocted by a spin lock since
  * multiple notifiers (say IPv4 and IPv6) may be running at the same
@@ -934,6 +939,7 @@ static struct sctp_af sctp_af_inet = {
 	.skb_iif	   = sctp_v4_skb_iif,
 	.is_ce		   = sctp_v4_is_ce,
 	.seq_dump_addr	   = sctp_v4_seq_dump_addr,
+	.ecn_capable	   = sctp_v4_ecn_capable,
 	.net_header_len	   = sizeof(struct iphdr),
 	.sockaddr_len	   = sizeof(struct sockaddr_in),
 #ifdef CONFIG_COMPAT
-- 
1.5.3.5
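
The per-address-family dispatch added here can be sketched in isolation. The example assumes a toy socket with separate IPv4 TOS and IPv6 traffic-class bytes; `struct fake_sock`, `struct af_ops`, and the `ECN_*` constants are hypothetical names, with the codepoint values following RFC 3168 (ECT(0) is binary 10 in the low two bits).

```c
/* ECN codepoints occupy the low two bits of the TOS / traffic class. */
enum {
	ECN_NOT_ECT = 0,
	ECN_ECT_1   = 1,
	ECN_ECT_0   = 2,
	ECN_CE      = 3,
	ECN_MASK    = 3
};

/* Toy socket: one byte for the IPv4 TOS, one for the IPv6 tclass. */
struct fake_sock {
	unsigned char tos;
	unsigned char tclass;
};

/* Per-address-family operations, mirroring the idea of a function
 * pointer that knows which header field carries the ECN bits.
 */
struct af_ops {
	void (*ecn_capable)(struct fake_sock *sk);
};

static void v4_ecn_capable(struct fake_sock *sk)
{
	sk->tos |= ECN_ECT_0;
}

static void v6_ecn_capable(struct fake_sock *sk)
{
	sk->tclass |= ECN_ECT_0;
}

const struct af_ops af_inet_ops  = { v4_ecn_capable };
const struct af_ops af_inet6_ops = { v6_ecn_capable };
```

The transmit path then calls `ops->ecn_capable(sk)` without caring about the family, and the upper six bits of each field are left untouched.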



* Re: [PATCH 1/6] SCTP: retran_path update bug fix
  2008-06-04 18:25 ` [PATCH 1/6] SCTP: retran_path update bug fix Vlad Yasevich
@ 2008-06-04 19:37   ` David Miller
  0 siblings, 0 replies; 13+ messages in thread
From: David Miller @ 2008-06-04 19:37 UTC
  To: vladislav.yasevich; +Cc: netdev, linux-sctp, guijianfeng

From: Vlad Yasevich <vladislav.yasevich@hp.com>
Date: Wed,  4 Jun 2008 14:25:46 -0400

> If the current retran_path is the only active one, it should
> be updated to the next inactive one.
> 
> Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>

Applied.


* Re: [PATCH 2/6] SCTP: Move sctp_v4_dst_saddr out of loop
  2008-06-04 18:25 ` [PATCH 2/6] SCTP: Move sctp_v4_dst_saddr out of loop Vlad Yasevich
@ 2008-06-04 19:38   ` David Miller
  0 siblings, 0 replies; 13+ messages in thread
From: David Miller @ 2008-06-04 19:38 UTC
  To: vladislav.yasevich; +Cc: netdev, linux-sctp, guijianfeng

From: Vlad Yasevich <vladislav.yasevich@hp.com>
Date: Wed,  4 Jun 2008 14:25:47 -0400

> There's no need to execute sctp_v4_dst_saddr() on each
> iteration; just move it out of the loop.
> 
> Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>

Applied.



* Re: [PATCH 3/6] SCTP: Correctly implement Fast Recovery cwnd manipulations.
  2008-06-04 18:25 ` [PATCH 3/6] SCTP: Correctly implement Fast Recovery cwnd manipulations Vlad Yasevich
@ 2008-06-04 19:38   ` David Miller
  0 siblings, 0 replies; 13+ messages in thread
From: David Miller @ 2008-06-04 19:38 UTC
  To: vladislav.yasevich; +Cc: netdev, linux-sctp

From: Vlad Yasevich <vladislav.yasevich@hp.com>
Date: Wed,  4 Jun 2008 14:25:48 -0400

> Correctly keep track of Fast Recovery state and do not reduce
> the congestion window multiple times during such a state.
> 
> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
> Tested-by: Wei Yongjun <yjwei@cn.fujitsu.com>

Applied.


* Re: [PATCH 4/6] SCTP: Start T3-RTX timer when fast retransmitting lowest TSN
  2008-06-04 18:25 ` [PATCH 4/6] SCTP: Start T3-RTX timer when fast retransmitting lowest TSN Vlad Yasevich
@ 2008-06-04 19:39   ` David Miller
  0 siblings, 0 replies; 13+ messages in thread
From: David Miller @ 2008-06-04 19:39 UTC
  To: vladislav.yasevich; +Cc: netdev, linux-sctp

From: Vlad Yasevich <vladislav.yasevich@hp.com>
Date: Wed,  4 Jun 2008 14:25:49 -0400

> When we are trying to fast retransmit the lowest outstanding TSN, we
> need to restart the T3-RTX timer, so that subsequent timeouts will
> correctly tag all the packets necessary for retransmissions.
> 
> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
> Tested-by: Wei Yongjun <yjwei@cn.fujitsu.com>

Applied.


* Re: [PATCH 5/6] SCTP: Flush the queue only once during fast retransmit.
  2008-06-04 18:25 ` [PATCH 5/6] SCTP: Flush the queue only once during fast retransmit Vlad Yasevich
@ 2008-06-04 19:39   ` David Miller
  0 siblings, 0 replies; 13+ messages in thread
From: David Miller @ 2008-06-04 19:39 UTC
  To: vladislav.yasevich; +Cc: netdev, linux-sctp

From: Vlad Yasevich <vladislav.yasevich@hp.com>
Date: Wed,  4 Jun 2008 14:25:50 -0400

> When fast retransmit is triggered by a sack, we should flush
> the queue only once so that only 1 retransmit happens.  Also,
> since we could potentially have non-fast-rtx chunks on
> the retransmit queue, we need to make sure any chunks eligible
> for fast retransmit are sent first during fast retransmission.
> 
> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
> Tested-by: Wei Yongjun <yjwei@cn.fujitsu.com>

Applied.


* Re: [PATCH 6/6] [SCTP]: Fix ECN markings for IPv6
  2008-06-04 18:25 ` [PATCH 6/6] [SCTP]: Fix ECN markings for IPv6 Vlad Yasevich
@ 2008-06-04 19:40   ` David Miller
  0 siblings, 0 replies; 13+ messages in thread
From: David Miller @ 2008-06-04 19:40 UTC
  To: vladislav.yasevich; +Cc: netdev, linux-sctp

From: Vlad Yasevich <vladislav.yasevich@hp.com>
Date: Wed,  4 Jun 2008 14:25:51 -0400

> Commit e9df2e8fd8fbc95c57dbd1d33dada66c4627b44c
> ([IPV6]: Use appropriate sock tclass setting for routing lookup.) also
> changed the way that ECN capable transports mark this capability in IPv6.
> As a result, SCTP was not marking ECN capability because the traffic class
> was never set.  This patch brings back the markings for IPv6 traffic.
> 
> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>

Also applied, thanks a lot Vlad!
