From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S267515AbUI1DNZ (ORCPT ); Mon, 27 Sep 2004 23:13:25 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S267517AbUI1DNZ (ORCPT ); Mon, 27 Sep 2004 23:13:25 -0400
Received: from rproxy.gmail.com ([64.233.170.195]:31560 "EHLO mproxy.gmail.com")
	by vger.kernel.org with ESMTP id S267515AbUI1DNU (ORCPT );
	Mon, 27 Sep 2004 23:13:20 -0400
Message-ID:
Date: Mon, 27 Sep 2004 23:13:20 -0400
From: Madhu Bandireddy
Reply-To: Madhu Bandireddy
To: linux-kernel@vger.kernel.org
Subject: Help: Connection drops with kernel 2.4.21-15 over Broadcom NetXtreme BCM5703 gigabit cards
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

We are seeing a large number of connection drops on our SAP application
servers with the following config:

Linux: Red Hat Enterprise Linux 3.0
Kernel: 2.4.21-15
HW: Dell 2650 with Broadcom NetXtreme BCM5703 gigabit ethernet card
CPU: 2x 2.8GHz
Memory: 12GB
Ethernet driver: Tigon 3.0

We could not find any errors on the interface itself. Most of these
connection drops come from sites that are connected to our data center
over high-latency site-to-site VPN connections.

In addition to the Linux servers in our SAP application server set, we
also have 2 HP PA-RISC servers running HP-UX 11.0. We do not see any such
connection drops on the HP-UX servers, which makes us believe that the
problem is related to the Linux environment.

Another thing to note is that these connection drops seem to occur when
there are a large number of connections (about 500 or so) to each server.
SAP seems to think this is an OS problem. We do not see any error messages
in any of the system logs or on the ethernet interfaces. The network
itself does not show any errors.

Here is the output of netstat -s:

lnas4 Fri Sep 24 11:11:07 CDT 2004
Tcp:
    27157 active connections openings
    37507 passive connection openings
    4 failed connection attempts
    318 connection resets received
    209 connections established
    70267903 segments received
    64033785 segments send out
    49041 segments retransmited
    395 bad segments received.
    3269 resets sent
TcpExt:
    3 packets pruned from receive queue because of socket buffer overrun
    16 ICMP packets dropped because socket was locked
    ArpFilter: 0
    31311 TCP sockets finished time wait in fast timer
    920380 delayed acks sent
    206 delayed acks further delayed because of locked socket
    Quick ack mode was activated 19329 times
    64619380 packets directly queued to recvmsg prequeue.
    337477000 packets directly received from backlog
    1199298288 packets directly received from prequeue
    48 packets dropped from prequeue
    1503618 packets header predicted
    64689963 packets header predicted and directly queued to user
    TCPPureAcks: 740615
    TCPHPAcks: 48292250
    TCPRenoRecovery: 0
    TCPSackRecovery: 675
    TCPSACKReneging: 0
    TCPFACKReorder: 5
    TCPSACKReorder: 2
    TCPRenoReorder: 0
    TCPTSReorder: 0
    TCPFullUndo: 0
    TCPPartialUndo: 0
    TCPDSACKUndo: 0
    TCPLossUndo: 5
    TCPLoss: 641
    TCPLostRetransmit: 0
    TCPRenoFailures: 0
    TCPSackFailures: 2879
    TCPLossFailures: 281
    TCPFastRetrans: 804
    TCPForwardRetrans: 11
    TCPSlowStartRetrans: 30896
    TCPTimeouts: 5143
    TCPRenoRecoveryFail: 0
    TCPSackRecoveryFail: 174
    TCPSchedulerFailed: 1
    TCPRcvCollapsed: 146
    TCPDSACKOldSent: 3987
    TCPDSACKOfoSent: 203
    TCPDSACKRecv: 39
    TCPDSACKOfoRecv: 0
    TCPAbortOnSyn: 0
    TCPAbortOnData: 10
    TCPAbortOnClose: 1
    TCPAbortOnMemory: 0
    TCPAbortOnTimeout: 222
    TCPAbortOnLinger: 0
    TCPAbortFailed: 0
    TCPMemoryPressures: 0

Since the problem seems to affect connections over site-to-site VPN and is
generally more pronounced during periods of high load, I believe this is
more of a tuning issue or a bug. Any help in resolving this problem is
greatly appreciated.

Thanks
Madhu
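
For reference, the counters above can be turned into rough rates with a
short script. This is only a sketch: the helper name pct is made up for
illustration, the numbers are copied by hand from the netstat -s output,
and the counters are cumulative, so the results are long-run averages
rather than a picture of the high-load periods.

```python
# Counters copied by hand from the netstat -s output above (host lnas4).
active_opens = 27157             # active connections openings
passive_opens = 37507            # passive connection openings
segments_sent = 64033785         # segments send out
segments_retransmitted = 49041   # segments retransmited
slow_start_retrans = 30896       # TCPSlowStartRetrans
aborts_on_timeout = 222          # TCPAbortOnTimeout


def pct(part, whole):
    """Return part as a percentage of whole (hypothetical helper)."""
    return 100.0 * part / whole


# Share of all sent segments that were retransmissions.
retrans_pct = pct(segments_retransmitted, segments_sent)

# Share of retransmissions that happened in slow start, i.e. after a
# retransmission timeout rather than during fast recovery.
slow_start_pct = pct(slow_start_retrans, segments_retransmitted)

# Connections aborted after repeated timeouts, relative to all
# connections opened (active plus passive).
abort_pct = pct(aborts_on_timeout, active_opens + passive_opens)

print(f"retransmitted segments:    {retrans_pct:.4f}% of segments sent")
print(f"slow-start retransmits:    {slow_start_pct:.1f}% of retransmits")
print(f"connections timed out:     {abort_pct:.3f}% of connections opened")
```

The overall retransmission rate is small, but most of the retransmits are
in slow start, which would be consistent with timeouts on the
high-latency VPN paths rather than with errors on the local interface.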