[PATCH] sctp: regression bug fixes

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH] sctp: regression bug fixes
@ 2009-01-22 22:36 Vlad Yasevich
  2009-01-22 22:36 ` [PATCH 1/4] sctp: Fix crc32c calculations on big-endian arhes Vlad Yasevich
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Vlad Yasevich @ 2009-01-22 22:36 UTC (permalink / raw)
  To: netdev; +Cc: linux-sctp, davem

David

This is a set of bug fixes that address various regressions reported to
the sctp list as well and to me privately.

Here is the list:
    1) SCTP checksum produced wrong result on big endian arch.
    2) Sometimes SCTP connection would hang withtout transmitting anything.
       Sometimes it would fail to retrnasmit after the second loss.
    3) The accept code raced with input processing resulting in locking the
       wrong socket.

Please apply and push to stable.

Thank you
-vlad

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/4] sctp: Fix crc32c calculations on big-endian arhes.
  2009-01-22 22:36 [PATCH] sctp: regression bug fixes Vlad Yasevich
@ 2009-01-22 22:36 ` Vlad Yasevich
  2009-01-23  5:19   ` Herbert Xu
  2009-01-22 22:36 ` [PATCH 2/4] sctp: Correctly start rtx timer on new packet transmissions Vlad Yasevich
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Vlad Yasevich @ 2009-01-22 22:36 UTC (permalink / raw)
  To: netdev; +Cc: linux-sctp, davem, Vlad Yasevich

crc32c algorithm provides a byteswaped result.  On little-endian
arches, the result ends up in big-endian/network byte order.
On big-endinan arches, the result ends up in little-endian
order and needs to be byte swapped again.  Thus calling cpu_to_le32
gives the right output.

Tested-by: Jukka Taimisto <jukka.taimisto@mail.suomi.net>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
---
 include/net/sctp/checksum.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/net/sctp/checksum.h b/include/net/sctp/checksum.h
index b799fb2..2fec3c3 100644
--- a/include/net/sctp/checksum.h
+++ b/include/net/sctp/checksum.h
@@ -79,5 +79,5 @@ static inline __be32 sctp_update_cksum(__u8 *buffer, __u16 length, __be32 crc32)
 
 static inline __be32 sctp_end_cksum(__be32 crc32)
 {
-	return ~crc32;
+	return (__force __be32)~cpu_to_le32((__force u32)crc32);
 }
-- 
1.5.3.5


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/4] sctp: Correctly start rtx timer on new packet transmissions.
  2009-01-22 22:36 [PATCH] sctp: regression bug fixes Vlad Yasevich
  2009-01-22 22:36 ` [PATCH 1/4] sctp: Fix crc32c calculations on big-endian arhes Vlad Yasevich
@ 2009-01-22 22:36 ` Vlad Yasevich
  2009-01-22 22:36 ` [PATCH 3/4] sctp: Properly timestamp outgoing data chunks for rtx purposes Vlad Yasevich
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Vlad Yasevich @ 2009-01-22 22:36 UTC (permalink / raw)
  To: netdev; +Cc: linux-sctp, davem, Vlad Yasevich

Commit 62aeaff5ccd96462b7077046357a6d7886175a57
(sctp: Start T3-RTX timer when fast retransmitting lowest TSN)
introduced a regression where it was possible to forcibly
restart the sctp retransmit timer at the transmission of any
new chunk.  This resulted in much longer timeout times and
sometimes hung sctp connections.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
---
 net/sctp/outqueue.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/net/sctp/outqueue.c b/net/sctp/outqueue.c
index 247ebc9..bc411c8 100644
--- a/net/sctp/outqueue.c
+++ b/net/sctp/outqueue.c
@@ -929,7 +929,6 @@ static int sctp_outq_flush(struct sctp_outq *q, int rtx_timeout)
 		}
 
 		/* Finally, transmit new packets.  */
-		start_timer = 0;
 		while ((chunk = sctp_outq_dequeue_data(q)) != NULL) {
 			/* RFC 2960 6.5 Every DATA chunk MUST carry a valid
 			 * stream identifier.
@@ -1028,7 +1027,7 @@ static int sctp_outq_flush(struct sctp_outq *q, int rtx_timeout)
 			list_add_tail(&chunk->transmitted_list,
 				      &transport->transmitted);
 
-			sctp_transport_reset_timers(transport, start_timer-1);
+			sctp_transport_reset_timers(transport, 0);
 
 			q->empty = 0;
 
-- 
1.5.3.5


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/4] sctp: Properly timestamp outgoing data chunks for rtx purposes
  2009-01-22 22:36 [PATCH] sctp: regression bug fixes Vlad Yasevich
  2009-01-22 22:36 ` [PATCH 1/4] sctp: Fix crc32c calculations on big-endian arhes Vlad Yasevich
  2009-01-22 22:36 ` [PATCH 2/4] sctp: Correctly start rtx timer on new packet transmissions Vlad Yasevich
@ 2009-01-22 22:36 ` Vlad Yasevich
  2009-01-22 22:36 ` [PATCH 4/4] sctp: Fix another socket race during accept/peeloff Vlad Yasevich
  2009-01-22 22:53 ` [PATCH] sctp: regression bug fixes David Miller
  4 siblings, 0 replies; 11+ messages in thread
From: Vlad Yasevich @ 2009-01-22 22:36 UTC (permalink / raw)
  To: netdev; +Cc: linux-sctp, davem, Vlad Yasevich

Recent changes to the retransmit code exposed a long standing
bug where it was possible for a chunk to be time stamped
after the retransmit timer was reset.  This caused a rare
situation where the retrnamist timer has expired, but
nothing was marked for retrnasmission because all of
timesamps on data were less then 1 rto ago.  As result,
the timer was never restarted since nothing was retransmitted,
and this resulted in a hung association that did couldn't
complete the data transfer.  The solution is to timestamp
the chunk when it's added to the packet for transmission
purposes.  After the packet is trsnmitted the rtx timer
is restarted.  This guarantees that when the timer expires,
there will be data to retransmit.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
---
 net/sctp/output.c |    7 ++++---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/net/sctp/output.c b/net/sctp/output.c
index c3f417f..7363935 100644
--- a/net/sctp/output.c
+++ b/net/sctp/output.c
@@ -324,14 +324,16 @@ append:
 	switch (chunk->chunk_hdr->type) {
 	    case SCTP_CID_DATA:
 		retval = sctp_packet_append_data(packet, chunk);
+		if (SCTP_XMIT_OK != retval)
+			goto finish;
 		/* Disallow SACK bundling after DATA. */
 		packet->has_sack = 1;
 		/* Disallow AUTH bundling after DATA */
 		packet->has_auth = 1;
 		/* Let it be knows that packet has DATA in it */
 		packet->has_data = 1;
-		if (SCTP_XMIT_OK != retval)
-			goto finish;
+		/* timestamp the chunk for rtx purposes */
+		chunk->sent_at = jiffies;
 		break;
 	    case SCTP_CID_COOKIE_ECHO:
 		packet->has_cookie_echo = 1;
@@ -470,7 +472,6 @@ int sctp_packet_transmit(struct sctp_packet *packet)
 			} else
 				chunk->resent = 1;
 
-			chunk->sent_at = jiffies;
 			has_data = 1;
 		}
 
-- 
1.5.3.5


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 4/4] sctp: Fix another socket race during accept/peeloff
  2009-01-22 22:36 [PATCH] sctp: regression bug fixes Vlad Yasevich
                   ` (2 preceding siblings ...)
  2009-01-22 22:36 ` [PATCH 3/4] sctp: Properly timestamp outgoing data chunks for rtx purposes Vlad Yasevich
@ 2009-01-22 22:36 ` Vlad Yasevich
  2009-01-22 22:53 ` [PATCH] sctp: regression bug fixes David Miller
  4 siblings, 0 replies; 11+ messages in thread
From: Vlad Yasevich @ 2009-01-22 22:36 UTC (permalink / raw)
  To: netdev; +Cc: linux-sctp, davem, Vlad Yasevich

There is a race between sctp_rcv() and sctp_accept() where we
have moved the association from the listening socket to the
accepted socket, but sctp_rcv() processing cached the old
socket and continues to use it.

The easy solution is to check for the socket mismatch once we've
grabed the socket lock.  If we hit a mis-match, that means
that were are currently holding the lock on the listening socket,
but the association is refrencing a newly accepted socket.  We need
to drop the lock on the old socket and grab the lock on the new one.

A more proper solution might be to create accepted sockets when
the new association is established, similar to TCP.  That would
eliminate the race for 1-to-1 style sockets, but it would still
existing for 1-to-many sockets where a user wished to peeloff an
association.  For now, we'll live with this easy solution as
it addresses the problem.

Reported-by: Michal Hocko <mhocko@suse.cz>
Reported-by: Karsten Keil <kkeil@suse.de>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
---
 net/sctp/input.c |   13 +++++++++++++
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/net/sctp/input.c b/net/sctp/input.c
index bf612d9..2e4a864 100644
--- a/net/sctp/input.c
+++ b/net/sctp/input.c
@@ -249,6 +249,19 @@ int sctp_rcv(struct sk_buff *skb)
 	 */
 	sctp_bh_lock_sock(sk);

+	if (sk != rcvr->sk) {
+		/* Our cached sk is different from the rcvr->sk.  This is
+		 * because migrate()/accept() may have moved the association
+		 * to a new socket and released all the sockets.  So now we
+		 * are holding a lock on the old socket while the user may
+		 * be doing something with the new socket.  Switch our veiw
+		 * of the current sk.
+		 */
+		sctp_bh_unlock_sock(sk);
+		sk = rcvr->sk;
+		sctp_bh_lock_sock(sk);
+	}
+
 	if (sock_owned_by_user(sk)) {
 		SCTP_INC_STATS_BH(SCTP_MIB_IN_PKT_BACKLOG);
 		sctp_add_backlog(sk, skb);
-- 
1.5.3.5

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] sctp: regression bug fixes
  2009-01-22 22:36 [PATCH] sctp: regression bug fixes Vlad Yasevich
                   ` (3 preceding siblings ...)
  2009-01-22 22:36 ` [PATCH 4/4] sctp: Fix another socket race during accept/peeloff Vlad Yasevich
@ 2009-01-22 22:53 ` David Miller
  4 siblings, 0 replies; 11+ messages in thread
From: David Miller @ 2009-01-22 22:53 UTC (permalink / raw)
  To: vladislav.yasevich; +Cc: netdev, linux-sctp

From: Vlad Yasevich <vladislav.yasevich@hp.com>
Date: Thu, 22 Jan 2009 17:36:04 -0500

> This is a set of bug fixes that address various regressions reported to
> the sctp list as well and to me privately.
> 
> Here is the list:
>     1) SCTP checksum produced wrong result on big endian arch.
>     2) Sometimes SCTP connection would hang withtout transmitting anything.
>        Sometimes it would fail to retrnasmit after the second loss.
>     3) The accept code raced with input processing resulting in locking the
>        wrong socket.
> 
> Please apply and push to stable.

All applied and queued up for -stable, thanks Vlad.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/4] sctp: Fix crc32c calculations on big-endian arhes.
  2009-01-22 22:36 ` [PATCH 1/4] sctp: Fix crc32c calculations on big-endian arhes Vlad Yasevich
@ 2009-01-23  5:19   ` Herbert Xu
  2009-01-23  5:28     ` Herbert Xu
  0 siblings, 1 reply; 11+ messages in thread
From: Herbert Xu @ 2009-01-23  5:19 UTC (permalink / raw)
  To: Vlad Yasevich; +Cc: netdev, linux-sctp, davem, vladislav.yasevich

Vlad Yasevich <vladislav.yasevich@hp.com> wrote:
>
> diff --git a/include/net/sctp/checksum.h b/include/net/sctp/checksum.h
> index b799fb2..2fec3c3 100644
> --- a/include/net/sctp/checksum.h
> +++ b/include/net/sctp/checksum.h
> @@ -79,5 +79,5 @@ static inline __be32 sctp_update_cksum(__u8 *buffer, __u16 length, __be32 crc32)
> 
> static inline __be32 sctp_end_cksum(__be32 crc32)
> {
> -       return ~crc32;
> +       return (__force __be32)~cpu_to_le32((__force u32)crc32);
> }

Ouch, surely there is a better way to do this?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/4] sctp: Fix crc32c calculations on big-endian arhes.
  2009-01-23  5:19   ` Herbert Xu
@ 2009-01-23  5:28     ` Herbert Xu
  2009-01-23  9:24       ` Herbert Xu
  0 siblings, 1 reply; 11+ messages in thread
From: Herbert Xu @ 2009-01-23  5:28 UTC (permalink / raw)
  To: Vlad Yasevich; +Cc: netdev, linux-sctp, davem

On Fri, Jan 23, 2009 at 04:19:56PM +1100, Herbert Xu wrote:
> Vlad Yasevich <vladislav.yasevich@hp.com> wrote:
> >
> > diff --git a/include/net/sctp/checksum.h b/include/net/sctp/checksum.h
> > index b799fb2..2fec3c3 100644
> > --- a/include/net/sctp/checksum.h
> > +++ b/include/net/sctp/checksum.h
> > @@ -79,5 +79,5 @@ static inline __be32 sctp_update_cksum(__u8 *buffer, __u16 length, __be32 crc32)
> > 
> > static inline __be32 sctp_end_cksum(__be32 crc32)
> > {
> > -       return ~crc32;
> > +       return (__force __be32)~cpu_to_le32((__force u32)crc32);
> > }
> 
> Ouch, surely there is a better way to do this?

In fact this looks wrong.  Has this code actually been tested
on big-endian?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/4] sctp: Fix crc32c calculations on big-endian arhes.
  2009-01-23  5:28     ` Herbert Xu
@ 2009-01-23  9:24       ` Herbert Xu
  2009-01-23 14:52         ` Vlad Yasevich
  0 siblings, 1 reply; 11+ messages in thread
From: Herbert Xu @ 2009-01-23  9:24 UTC (permalink / raw)
  To: Vlad Yasevich; +Cc: netdev, linux-sctp, davem

On Fri, Jan 23, 2009 at 04:28:49PM +1100, Herbert Xu wrote:
>
> > > static inline __be32 sctp_end_cksum(__be32 crc32)
> > > {
> > > -       return ~crc32;
> > > +       return (__force __be32)~cpu_to_le32((__force u32)crc32);
> > > }
> > 
> > Ouch, surely there is a better way to do this?
> 
> In fact this looks wrong.  Has this code actually been tested
> on big-endian?

Reading this again it does seem to do the right thing as it's
using the raw crc32c interface as opposed to crypto crc32c.

However, I suggest that we change it as follows:

1) Make sh->csum __le32 since we're using crc32c_le.
2) Change all intermediate values in sctp/checksum.h to u32.
3) Make sctp_end_cksum return __le32 and have it do

	return cpu_to_le32(~crc);

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/4] sctp: Fix crc32c calculations on big-endian arhes.
  2009-01-23  9:24       ` Herbert Xu
@ 2009-01-23 14:52         ` Vlad Yasevich
  2009-01-23 22:17           ` Herbert Xu
  0 siblings, 1 reply; 11+ messages in thread
From: Vlad Yasevich @ 2009-01-23 14:52 UTC (permalink / raw)
  To: Herbert Xu; +Cc: netdev, linux-sctp, davem

Herbert Xu wrote:
> On Fri, Jan 23, 2009 at 04:28:49PM +1100, Herbert Xu wrote:
>>>> static inline __be32 sctp_end_cksum(__be32 crc32)
>>>> {
>>>> -       return ~crc32;
>>>> +       return (__force __be32)~cpu_to_le32((__force u32)crc32);
>>>> }
>>> Ouch, surely there is a better way to do this?
>> In fact this looks wrong.  Has this code actually been tested
>> on big-endian?

Yes, the code was tested by me and the person who reported it
as listed in the commit log. :)

> 
> Reading this again it does seem to do the right thing as it's
> using the raw crc32c interface as opposed to crypto crc32c.
> 
> However, I suggest that we change it as follows:
> 
> 1) Make sh->csum __le32 since we're using crc32c_le.
> 2) Change all intermediate values in sctp/checksum.h to u32.
> 3) Make sctp_end_cksum return __le32 and have it do
> 
> 	return cpu_to_le32(~crc);
> 

I'll give it some thought.  It would clean up all the __force
casts, but it will be a little misleading to define a packet
checksum as little endian. :)

-vlad

> Cheers,

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/4] sctp: Fix crc32c calculations on big-endian arhes.
  2009-01-23 14:52         ` Vlad Yasevich
@ 2009-01-23 22:17           ` Herbert Xu
  0 siblings, 0 replies; 11+ messages in thread
From: Herbert Xu @ 2009-01-23 22:17 UTC (permalink / raw)
  To: Vlad Yasevich; +Cc: netdev, linux-sctp, davem

On Fri, Jan 23, 2009 at 09:52:44AM -0500, Vlad Yasevich wrote:
>
> I'll give it some thought.  It would clean up all the __force
> casts, but it will be a little misleading to define a packet
> checksum as little endian. :)

Not at all! Just because something is on the network doesn't
mean that it's big-endian.  In this case, we're using the little-
endian CRC32C implementation for the checksum so it makes perfect
sense for it to be marked as __le32.

Network-order just has to be fixed in endian, i.e., either big
or small, you just need to stick with the one you choose regardless
of the endianness of the host.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2009-01-23 22:17 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-01-22 22:36 [PATCH] sctp: regression bug fixes Vlad Yasevich
2009-01-22 22:36 ` [PATCH 1/4] sctp: Fix crc32c calculations on big-endian arhes Vlad Yasevich
2009-01-23  5:19   ` Herbert Xu
2009-01-23  5:28     ` Herbert Xu
2009-01-23  9:24       ` Herbert Xu
2009-01-23 14:52         ` Vlad Yasevich
2009-01-23 22:17           ` Herbert Xu
2009-01-22 22:36 ` [PATCH 2/4] sctp: Correctly start rtx timer on new packet transmissions Vlad Yasevich
2009-01-22 22:36 ` [PATCH 3/4] sctp: Properly timestamp outgoing data chunks for rtx purposes Vlad Yasevich
2009-01-22 22:36 ` [PATCH 4/4] sctp: Fix another socket race during accept/peeloff Vlad Yasevich
2009-01-22 22:53 ` [PATCH] sctp: regression bug fixes David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).