* Possible SCTP peer receive window bug
@ 2012-11-26 13:31 Jamie Parsons
From: Jamie Parsons @ 2012-11-26 13:31 UTC (permalink / raw)
  To: linux-sctp

Hi,

My name is Jamie Parsons.  I am working on a test tool that uses lksctp (lksctp-tools.x86_64 on a Linux box with a 2.6.32-279.9.1.el6.x86_64 kernel) to drive the SCTP interface on one of our products, and I think I may have found a bug in the handling of the peer receive window size.

Having looked at this kernel maintainers list (http://lxr.linux.no/#linux+v3.6.7/MAINTAINERS) I believe that you are the people I should contact to report a bug.  If not, please let me know who I should be talking to instead.

If you are the correct people, can you please look at the detailed description below?  My suspicion is that some data structures are not being reinitialized correctly after an unexpected INIT is received.

I've had a quick look at recent check-ins for the kernel and couldn't see anything that was obviously a fix for this bug.  Would you be able to help with debugging/fixing this issue?  I'm happy to repro it to get any diagnostics required.

Thanks for your help,

Jamie 

=====================

__TEST SETUP__
I've set up an SCTP association between a Linux box and a fault-tolerant peer.  I collect a Wireshark capture on the Linux box throughout the test and periodically poll the Linux kernel for SCTP_STATUS using getsockopt().
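
For reference, the periodic SCTP_STATUS poll can be sketched roughly as below. This is a minimal illustration, not the test tool's actual code: the helper name poll_sctp_status is hypothetical, and the sketch assumes the kernel UAPI header <linux/sctp.h> is available (a tool built against lksctp would more likely include <netinet/sctp.h>).

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <linux/sctp.h>

/*
 * Hypothetical helper: read SCTP_STATUS for one association.
 * For a one-to-one style socket assoc_id can be 0.
 * Returns 0 on success, -1 on error (errno is set by getsockopt()).
 */
int poll_sctp_status(int fd, sctp_assoc_t assoc_id, struct sctp_status *out)
{
    socklen_t len = sizeof(*out);

    memset(out, 0, sizeof(*out));
    out->sstat_assoc_id = assoc_id;

    /* IPPROTO_SCTP is used as the option level for SCTP socket options. */
    return getsockopt(fd, IPPROTO_SCTP, SCTP_STATUS, out, &len);
}
```

On success, out->sstat_rwnd, out->sstat_unackdata and out->sstat_state hold the values discussed in the rest of this report.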

After letting it run cleanly for a few minutes, I then deliberately induce a fault on the peer to make it failover.  The peer then restarts the connection by sending an INIT to the Linux box (as covered by section 5.2.2 of RFC 4960).

__SYMPTOMS__
Initially, the peer advertises a receive window of 2000 (I check this by looking at sctp.sack_a_rwnd in Wireshark).  I can confirm that the Linux SCTP stack agrees with this value by calling getsockopt() with SCTP_STATUS and checking sstat_rwnd.  At this stage there is no problem: the SCTP stack reports 2000, with a slight deviation if some unacked data is outstanding.  All good so far!

After the failover, the Wireshark trace still shows the peer advertising a receive window of 2000.  However, if I now check the peer receive window through the Linux SCTP stack as above, it reports a consistently lower value (916 in my last run), again with a slight deviation if there are unacked packets.

At this point I stop sending any data from the Linux box, and wait a couple of minutes to ensure that the send buffer is emptied and all packets have been acked.  The last SACK sent by the peer has a receive window value of 2000 but the SCTP stack is still reporting a value of 916 with no packets unacked.
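
The condition described above (nothing outstanding, yet sstat_rwnd below the a_rwnd last seen on the wire) could be checked with something like the following sketch. The function name and the idea of passing in the a_rwnd read manually from the Wireshark trace are assumptions made for illustration; the header choice (<linux/sctp.h>) is also an assumption.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <linux/sctp.h>

/*
 * Hypothetical check: returns 1 when the kernel reports no unacked
 * DATA but still believes the peer window is smaller than the a_rwnd
 * carried in the peer's last SACK (last_a_rwnd is supplied by the
 * caller, e.g. read from the packet trace). Returns 0 otherwise.
 */
int peer_rwnd_looks_stale(const struct sctp_status *st, unsigned int last_a_rwnd)
{
    return st->sstat_unackdata == 0 && st->sstat_rwnd < last_a_rwnd;
}
```

With the numbers from this run (sstat_rwnd of 916, a_rwnd of 2000, 0 unacked) the check returns 1; before the failover (2000 and 2000) it returns 0.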

The problem is compounded by the fact that the SCTP association now can't be brought down from the Linux side.  I have set SO_LINGER 'on' with a time of 0.  If I call shutdown(SCK, SHUT_RDWR) before a failover then I can see the Linux box send an ABORT in the wireshark trace to tear down the association.
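
The SO_LINGER setting mentioned above is the standard socket-level option; a minimal sketch of how it might be set is shown here. The helper name enable_zero_linger is hypothetical and this is not the tool's actual code.

```c
#include <sys/socket.h>

/*
 * Hypothetical helper: turn SO_LINGER on with a linger time of 0.
 * Returns 0 on success, -1 on error (errno is set by setsockopt()).
 */
int enable_zero_linger(int fd)
{
    struct linger lg;

    lg.l_onoff = 1;   /* linger enabled */
    lg.l_linger = 0;  /* timeout of 0 seconds */

    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}
```

The teardown in the test is then enable_zero_linger(SCK) followed by shutdown(SCK, SHUT_RDWR), as described above.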

If I call shutdown(SCK, SHUT_RDWR) after the peer has failed over, no ABORT message is sent.  Using getsockopt() with SCTP_STATUS I can see that the stack is in state 5 (SHUTDOWN_PENDING) and stays there indefinitely, which means it is waiting for packets to be acked.  This is despite the fact that it reports a value of 0 for unacked packets.
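
To make the state numbers easier to read in logs, the sstat_state value can be mapped to a name using the enum from the kernel UAPI header. The sketch below is illustrative only: both function names are hypothetical, and it assumes <linux/sctp.h> provides enum sctp_sstat_state (where SCTP_SHUTDOWN_PENDING is 5).

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <linux/sctp.h>

/* Hypothetical helper: printable name for an sstat_state value. */
const char *sctp_state_name(int state)
{
    switch (state) {
    case SCTP_EMPTY:             return "EMPTY";
    case SCTP_CLOSED:            return "CLOSED";
    case SCTP_COOKIE_WAIT:       return "COOKIE_WAIT";
    case SCTP_COOKIE_ECHOED:     return "COOKIE_ECHOED";
    case SCTP_ESTABLISHED:       return "ESTABLISHED";
    case SCTP_SHUTDOWN_PENDING:  return "SHUTDOWN_PENDING";
    case SCTP_SHUTDOWN_SENT:     return "SHUTDOWN_SENT";
    case SCTP_SHUTDOWN_RECEIVED: return "SHUTDOWN_RECEIVED";
    case SCTP_SHUTDOWN_ACK_SENT: return "SHUTDOWN_ACK_SENT";
    default:                     return "UNKNOWN";
    }
}

/*
 * Hypothetical check for the symptom above: the association sits in
 * SHUTDOWN_PENDING although no DATA is reported as unacked.
 */
int stuck_in_shutdown_pending(const struct sctp_status *st)
{
    return st->sstat_state == SCTP_SHUTDOWN_PENDING &&
           st->sstat_unackdata == 0;
}
```

Calling stuck_in_shutdown_pending() on successive polls would show whether the association ever leaves the state.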




Thread overview: 31+ messages
2012-11-26 13:31 Possible SCTP peer receive window bug Jamie Parsons
2012-11-26 15:28 ` Neil Horman
2012-11-26 17:27 ` Jamie Parsons
2012-11-26 20:10 ` Neil Horman
2012-11-27 11:05 ` Jamie Parsons
2012-11-27 14:38 ` Neil Horman
2012-11-27 14:42 ` Jamie Parsons
2012-11-28 15:28 ` Neil Horman
2012-11-28 15:50 ` Vlad Yasevich
2012-11-28 20:55 ` Neil Horman
2012-11-28 21:25 ` Vlad Yasevich
2012-11-29  9:14 ` Jamie Parsons
2012-11-29  9:17 ` Jamie Parsons
2012-11-29 14:48 ` Neil Horman
2012-11-29 14:58 ` Neil Horman
2012-12-04 13:34 ` Jamie Parsons
2012-12-04 14:58 ` Neil Horman
2012-12-05 16:30 ` Neil Horman
2012-12-05 17:11 ` Vlad Yasevich
2012-12-06 14:03 ` Neil Horman
2012-12-06 15:42 ` Jamie Parsons
2012-12-06 19:14 ` Neil Horman
2012-12-06 21:39 ` Frank Ch. Eigler
2012-12-17 11:08 ` Jamie Parsons
2012-12-17 14:13 ` Neil Horman
2012-12-17 15:12 ` Vlad Yasevich
2012-12-20 12:17 ` Jamie Parsons
2013-01-16 16:58 ` Jamie Parsons
2013-01-16 21:11 ` Neil Horman
2013-01-17 16:45 ` Jamie Parsons
2013-01-17 17:43 ` Neil Horman
