From mboxrd@z Thu Jan 1 00:00:00 1970
From: Venkat Venkatsubra
Subject: oops in tcp_xmit_retransmit_queue
Date: Wed, 22 Jan 2014 09:32:49 -0800 (PST)
Message-ID: <8d17033a-2ebb-4e8e-9a38-c80738286d2e@default>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8BIT
Cc: davem@davemloft.net
To: netdev@vger.kernel.org
Return-path:
Received: from aserp1040.oracle.com ([141.146.126.69]:50811 "EHLO aserp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752538AbaAVRcx convert rfc822-to-8bit (ORCPT ); Wed, 22 Jan 2014 12:32:53 -0500
Sender: netdev-owner@vger.kernel.org
List-ID:

We hit this crash in tcp_xmit_retransmit_queue.

@ BUG: unable to handle kernel NULL pointer dereference at (null)
@ IP: [] tcp_xmit_retransmit_queue+0x21e/0x25d
@ .
@ Call Trace:
@
@ [] tcp_ack+0x1662/0x168d
@ [] ? tcp_init_tso_segs+0x3a/0x51
@ [] ? tcp_validate_incoming+0x69/0x296
@ [] tcp_rcv_established+0x4db/0x566
@ [] tcp_v4_do_rcv+0x196/0x352
@ [] ? local_bh_enable+0x12/0x14
@ [] tcp_v4_rcv+0x459/0x6d0
@ [] ? test_tsk_thread_flag+0x12/0x14
@ [] ? select_idle_sibling+0x3a/0xe7
@ [] ip_local_deliver_finish+0x152/0x1fa
@ [] ip_local_deliver+0x72/0x7d
@ [] ip_rcv_finish+0x372/0x38c
@ [] ? tcp_gro_receive+0x7e/0x1e5
@ [] ip_rcv+0x2a2/0x2e1
@ [] __netif_receive_skb+0x41b/0x440
@ [] netif_receive_skb+0x49/0x50
@ [] napi_skb_finish+0x2b/0x42
@ [] napi_gro_receive+0x2f/0x34
@ [] igb_poll+0x808/0xb78 [igb]
@ [] ? __enqueue_entity+0x79/0x7b
@ [] net_rx_action+0xc6/0x1cd
@ [] __do_softirq+0xd7/0x19e
@ [] ? handle_IRQ_event+0x10a/0x120
@ [] call_softirq+0x1c/0x30
@ [] do_softirq+0x46/0x89
@ [] irq_exit+0x3b/0x7a
@ [] do_IRQ+0x99/0xb0
@ [] ret_from_intr+0x0/0x11
@
@ [] ? mwait_idle+0x74/0x7f
@ [] ? mwait_idle+0x67/0x7f
@ [] ? cpu_idle+0xa5/0xd4
@ [] ? start_secondary+0x1fd/0x23c
@ .
@ RIP [] tcp_xmit_retransmit_queue+0x21e/0x25d

tp->retransmit_skb_hint is non-NULL.
retransmit_skb_hint->next is NULL.
It crashes while walking through this list:

	tcp_for_write_queue_from(skb, sk) {
		__u8 sacked = TCP_SKB_CB(skb)->sacked;

retransmit_skb_hint points to a sequence range well before tp->snd_una:
both "seq" and "end_seq" in its tcp_skb_cb are before tp->snd_una.

It looks like tp->retransmit_skb_hint is either not cleared in some path
or gets set when it should not be.

Some more info on the customer's environment:
- SACK is enabled
- this occurred on 2.6.32-400.1.1.el5uek, which is based on linux-2.6.32
- the problem is not reproducible

Is this a known problem that has been fixed after 2.6.32?

Thanks.

Venkat
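
For illustration only, here is a small userspace model of how a stale hint could produce the state described above (hint non-NULL, hint->next NULL, walk runs off the queue). It is a sketch, not kernel code. It assumes the write queue is a circular list with a sentinel head and that unlinking an entry clears its next/prev pointers, as __skb_unlink() does. The names struct node, queue_init, queue_tail, queue_unlink, walk_from and demo_stale_hint are hypothetical and exist only in this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff: only what the walk needs. */
struct node {
	struct node *next, *prev;
	unsigned int seq;
};

/* Circular queue with a sentinel head (assumed to mirror sk_buff_head). */
static void queue_init(struct node *head)
{
	head->next = head->prev = head;
	head->seq = 0;
}

/* Append n at the tail of the queue. */
static void queue_tail(struct node *head, struct node *n)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink n and clear its links (assumption: same as __skb_unlink()). */
static void queue_unlink(struct node *n)
{
	assert(n->next != NULL && n->prev != NULL);
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = n->prev = NULL;
}

/*
 * Walk from 'from' until the sentinel is reached.
 * Returns the number of entries visited, or -1 if the walk reaches NULL,
 * which is the point where an unchecked loop would dereference NULL.
 */
static int walk_from(const struct node *head, const struct node *from)
{
	int visited = 0;

	for (const struct node *n = from; n != head; n = n->next) {
		if (n == NULL)
			return -1;
		visited++;
	}
	return visited;
}

/* The hint keeps pointing at an entry that has already been unlinked. */
static int demo_stale_hint(void)
{
	struct node head, a, b, c;
	struct node *hint;

	queue_init(&head);
	a.seq = 100;
	b.seq = 200;
	c.seq = 300;
	queue_tail(&head, &a);
	queue_tail(&head, &b);
	queue_tail(&head, &c);

	hint = &a;         /* plays the role of tp->retransmit_skb_hint */
	queue_unlink(&a);  /* entry removed, hint not cleared */

	return walk_from(&head, hint);
}
```

In this model demo_stale_hint() returns -1: the walk starts at the unlinked entry, follows its NULL next pointer, and never meets the sentinel. Whether the real crash follows this path depends on which code left the hint set, which the model does not answer.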