From mboxrd@z Thu Jan  1 00:00:00 1970
From: Andrew Morton <akpm@osdl.org>
Subject: Fw: [Bugme-new] [Bug 1675] New: TCP occasionally ignores a FIN,
 requiring a retransmit
Date: Mon, 15 Dec 2003 02:34:23 -0800
Sender: netdev-bounce@oss.sgi.com
Message-ID: <20031215023423.3d1ce730.akpm@osdl.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Cc: kieranm@gtemail.net
Return-path: <netdev-bounce@oss.sgi.com>
To: netdev@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
List-Id: netdev.vger.kernel.org



Begin forwarded message:

Date: Fri, 12 Dec 2003 09:18:33 -0800
From: bugme-daemon@osdl.org
To: bugme-new@lists.osdl.org
Subject: [Bugme-new] [Bug 1675] New: TCP occasionally ignores a FIN, requiring a retransmit


http://bugme.osdl.org/show_bug.cgi?id=1675

           Summary: TCP occasionally ignores a FIN, requiring a retransmit
    Kernel Version: 2.4.20-18.9, 2.4.22
            Status: NEW
          Severity: low
             Owner: niv@us.ibm.com
         Submitter: kieranm@gtemail.net


Distribution: Red Hat Linux 9
Hardware Environment: Dual Intel Xeon Server, with Intel Corp. 82546EB Gigabit Ethernet 
Controller 
Software Environment: Linux, symptom provoked using a small NetPIPE test
Problem Description:

If a FIN is delivered to the Linux TCP stack within about 10us of the stack sending its 
own FIN|ACK, the stack does not acknowledge the received FIN.  The other node then has to 
retransmit its FIN, and the retransmitted FIN is acknowledged correctly. 

I can supply an ethereal trace to illustrate this.

I suspect it is a race in the state change of the TCP connection, although I can't see 
an obvious candidate for it in the source - it seems to correctly implement the TCP 
state diagram.
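
For reference, the close-side part of the RFC 793 state diagram can be written down as a 
small transition table.  This is only an illustrative Python model of the diagram, not the 
Linux code; the names TRANSITIONS, step and run, and the event names, are made up for the 
sketch.  The point is that every transition triggered by a received FIN sends an ACK, 
including the simultaneous-close case (FIN_WAIT_1 to CLOSING) that this report is about.

```python
# Illustrative model of the RFC 793 close-side transitions; NOT kernel code.
# Each entry maps (state, event) -> (next_state, segment_to_send or None).
# Event and segment names are hypothetical labels chosen for this sketch.
TRANSITIONS = {
    ("ESTABLISHED", "app_close"): ("FIN_WAIT_1", "FIN|ACK"),
    ("ESTABLISHED", "rcv_fin"): ("CLOSE_WAIT", "ACK"),
    ("FIN_WAIT_1", "rcv_ack_of_fin"): ("FIN_WAIT_2", None),
    # Simultaneous close: peer's FIN arrives before the ACK of our FIN.
    ("FIN_WAIT_1", "rcv_fin"): ("CLOSING", "ACK"),
    # Peer's segment carries both its FIN and the ACK of our FIN.
    ("FIN_WAIT_1", "rcv_fin_and_ack_of_fin"): ("TIME_WAIT", "ACK"),
    ("FIN_WAIT_2", "rcv_fin"): ("TIME_WAIT", "ACK"),
    ("CLOSING", "rcv_ack_of_fin"): ("TIME_WAIT", None),
    ("CLOSE_WAIT", "app_close"): ("LAST_ACK", "FIN|ACK"),
    ("LAST_ACK", "rcv_ack_of_fin"): ("CLOSED", None),
}


def step(state, event):
    """Return (next_state, segment_to_send) for one event."""
    return TRANSITIONS[(state, event)]


def run(events, state="ESTABLISHED"):
    """Apply events in order; return the final state and the segments sent."""
    sent = []
    for event in events:
        state, segment = step(state, event)
        if segment is not None:
            sent.append(segment)
    return state, sent


# Simultaneous close as seen from the Linux node: it sends FIN|ACK, then the
# peer's FIN arrives, then the peer's ACK of our FIN arrives.
final_state, sent = run(["app_close", "rcv_fin", "rcv_ack_of_fin"])
```

In this model the second segment sent is the ACK of the peer's FIN, which is the segment 
that appears to be missing in the behaviour described above.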

Because it seems to involve only FINs, and a retransmission resolves it, this problem 
would not normally be visible.  As a result this is an annoyance rather than a serious 
problem, but having spent a week convincing myself that it is a bug in Linux, I'm quite 
interested to find out what's wrong.

I have tested it on 2.4.20 (with hyperthreading enabled) and 2.4.22 (with hyperthreading 
disabled) kernels.  Both are SMP kernels, which could be the cause of the race.

Steps to reproduce:
I am provoking the symptom using a different TCP stack on the remote node, which is 
itself running on a proprietary high-performance network, bridged onto the Ethernet that 
the Linux node is on.  Because the problem is so sensitive to the timing of the delivery 
of the FIN packet, it will be hard to reproduce on another setup.  The traffic is 
generated by a simple NetPIPE test:
NPtcp -p 0 -i -l 64 -n 10 -u 64 -h <hostname>

I'm happy to try any suggested fixes on this setup to see whether they resolve the 
problem.  Also, let me know if you have any questions that would help track it down.  The 
ethereal trace is the best way to see what the problem is.
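
What to look for in such a trace is a FIN with the same sequence number arriving twice 
from the remote node.  A rough sketch of that check follows, assuming the capture has 
already been decoded into simple records for a single connection.  The Segment fields, the 
function name find_retransmitted_fins and every value in the sample trace are invented for 
illustration; they are not taken from the real capture or from any ethereal export format.

```python
from collections import namedtuple

# Hypothetical decoded-segment record; flags is a string such as "FA" or "A".
# Ports are omitted because the sketch assumes a single connection.
Segment = namedtuple("Segment", "time src dst seq ack flags")


def find_retransmitted_fins(segments):
    """Return (first, repeat) pairs where a FIN is seen again with the same seq."""
    first_seen = {}
    retransmits = []
    for seg in segments:
        if "F" not in seg.flags:
            continue
        key = (seg.src, seg.dst, seg.seq)
        if key in first_seen:
            retransmits.append((first_seen[key], seg))
        else:
            first_seen[key] = seg
    return retransmits


# Invented example of the reported pattern (times in seconds, all values made up).
trace = [
    Segment(0.000000, "linux", "remote", 1000, 500, "FA"),   # Linux sends FIN|ACK
    Segment(0.000008, "remote", "linux", 500, 1000, "FA"),   # remote FIN crosses it
    Segment(0.000050, "remote", "linux", 501, 1001, "A"),    # remote ACKs Linux's FIN
    Segment(0.200000, "remote", "linux", 500, 1001, "FA"),   # remote retransmits FIN
    Segment(0.200040, "linux", "remote", 1001, 501, "A"),    # Linux finally ACKs it
]

for first, repeat in find_retransmitted_fins(trace):
    print("FIN seq", first.seq, "from", first.src,
          "sent at", first.time, "and again at", repeat.time)
```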
