* Panic at tcp_xmit_retransmit_queue
@ 2010-01-19 16:13 sbs
2010-01-19 19:36 ` sbs
2010-02-01 14:45 ` sbs
0 siblings, 2 replies; 7+ messages in thread
From: sbs @ 2010-01-19 16:13 UTC (permalink / raw)
To: netdev
We are hitting kernel panics on servers with nVidia MCP55 NICs about once a day.
They usually appear under high network traffic (around 10000Mbit/s), but
that is not a rule; it has happened even under low traffic.
The servers run nginx serving static content.
On 2 identical servers this panic happens approximately 2 times a day, depending on
network load. The machine freezes completely until the netconsole reboots.
Kernel: 2.6.32.3
What can it be? What's wrong with the tcp_xmit_retransmit_queue() function?
Can anyone explain or fix it?
Panic output:
Dec 29 22:33:51 linuxtest [1188725.037019] BUG: unable to handle kernel NULL pointer dereference at (null)
Dec 29 22:33:51 linuxtest [1188725.037042] IP: [<c060164a>] tcp_xmit_retransmit_queue+0x1b2/0x1dc
Dec 29 22:33:51 linuxtest [1188725.037064] *pdpt = 00000000229c2001 *pde = 0000000000000000
Dec 29 22:33:51 linuxtest [1188725.037080] Thread overran stack, or stack corrupted
Dec 29 22:33:51 linuxtest [1188725.037091] Oops: 0000 [#1] SMP
Dec 29 22:33:51 linuxtest [1188725.037104] last sysfs file: /sys/devices/pci0000:00/0000:00:0f.0/0000:07:00.0/0000:08:01.0/0000:09:00.0/class
Dec 29 22:33:51 linuxtest [1188725.037124]
Dec 29 22:33:51 linuxtest [1188725.037131] Pid: 0, comm: swapper Not tainted (2.6.31.6-v03 #2) H8DMU
Dec 29 22:33:51 linuxtest [1188725.037145] EIP: 0060:[<c060164a>] EFLAGS: 00010246 CPU: 0
Dec 29 22:33:51 linuxtest [1188725.037158] EIP is at tcp_xmit_retransmit_queue+0x1b2/0x1dc
Dec 29 22:33:51 linuxtest [1188725.037170] EAX: c540513c EBX: c54050c0 ECX: 0e377f15 EDX: c540513c
Dec 29 22:33:51 linuxtest [1188725.037183] ESI: 00000000 EDI: 00000000 EBP: c0805d28 ESP: c0805d0c
Dec 29 22:33:51 linuxtest [1188725.037196] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Dec 29 22:33:51 linuxtest [1188725.037208] Process swapper (pid: 0, ti=c0804000 task=c080b5a0 task.ti=c0804000)
Dec 29 22:33:51 linuxtest [1188725.037285] Stack:
Dec 29 22:33:51 linuxtest [1188725.037368] 00000202 00000000 c540513c 0e377f14 00000000 c54050c0 0000050e c0805da8
Dec 29 22:33:51 linuxtest [1188725.037472] <0> c05fe931 00000001 00000001 00000006 00000005 00000001 00000001 00000006
Dec 29 22:33:51 linuxtest [1188725.037629] <0> 01000246 00000005 11b57b53 c5405168 c061df41 00000006 00000000 00000000
Dec 29 22:33:51 linuxtest [1188725.037887] Call Trace:
Dec 29 22:33:51 linuxtest [1188725.037975] [<c05fe931>] ? tcp_ack+0x1591/0x1778
Dec 29 22:33:51 linuxtest [1188725.038073] [<c061df41>] ? ipt_do_table+0x2f8/0x310
Dec 29 22:33:51 linuxtest [1188725.038148] [<c05ff493>] ? tcp_rcv_state_process+0x4db/0x7fc
Dec 29 22:33:51 linuxtest [1188725.038246] [<c0604e3d>] ? tcp_v4_do_rcv+0x263/0x29d
Dec 29 22:33:51 linuxtest [1188725.038321] [<c023381a>] ? local_bh_enable+0xb/0xd
Dec 29 22:33:51 linuxtest [1188725.038419] [<c05d4571>] ? sk_filter+0x5e/0x69
Dec 29 22:33:51 linuxtest [1188725.038510] [<c06059b4>] ? tcp_v4_rcv+0x371/0x502
Dec 29 22:33:51 linuxtest [1188725.038607] [<c05ee78c>] ? ip_local_deliver_finish+0x0/0x171
Dec 29 22:33:51 linuxtest [1188725.038684] [<c05ee88a>] ? ip_local_deliver_finish+0xfe/0x171
Dec 29 22:33:51 linuxtest [1188725.038784] [<c05ee95e>] ? ip_local_deliver+0x61/0x66
Dec 29 22:33:51 linuxtest [1188725.038876] [<c05ee531>] ? ip_rcv_finish+0x289/0x2b1
Dec 29 22:33:51 linuxtest [1188725.038961] [<c05ee75c>] ? ip_rcv+0x203/0x233
Dec 29 22:33:51 linuxtest [1188725.039052] [<c05ca149>] ? netif_receive_skb+0x335/0x350
Dec 29 22:33:51 linuxtest [1188725.039151] [<c05ca1c6>] ? process_backlog+0x62/0x88
Dec 29 22:33:51 linuxtest [1188725.039242] [<c05ca6c5>] ? net_rx_action+0x8e/0x16b
Dec 29 22:33:51 linuxtest [1188725.039333] [<c02335bb>] ? __do_softirq+0xa7/0x148
Dec 29 22:33:51 linuxtest [1188725.039423] [<c0233682>] ? do_softirq+0x26/0x2b
Dec 29 22:33:51 linuxtest [1188725.039520] [<c0233764>] ? irq_exit+0x29/0x5c
Dec 29 22:33:51 linuxtest [1188725.039610] [<c0204365>] ? do_IRQ+0x81/0x95
Dec 29 22:33:51 linuxtest [1188725.039706] [<c0202ec9>] ? common_interrupt+0x29/0x30
Dec 29 22:33:51 linuxtest [1188725.039797] [<c0208b74>] ? default_idle+0x3e/0x5b
Dec 29 22:33:51 linuxtest [1188725.039895] [<c02479c9>] ? clockevents_notify+0x60/0x65
Dec 29 22:33:51 linuxtest [1188725.039986] [<c0208c49>] ? c1e_idle+0xb8/0xd2
Dec 29 22:33:51 linuxtest [1188725.040058] [<c0201bba>] ? cpu_idle+0x45/0x5f
Dec 29 22:33:51 linuxtest [1188725.040131] [<c0643560>] ? rest_init+0x58/0x5a
Dec 29 22:33:51 linuxtest [1188725.040212] [<c084f7f9>] ? start_kernel+0x2f0/0x2f5
Dec 29 22:33:51 linuxtest [1188725.040285] [<c084f070>] ? i386_start_kernel+0x70/0x77
Dec 29 22:33:51 linuxtest [1188725.040381] Code: ec bd 84 c0 ff 04 88 8b 55 ec 8b 02 39 d0 ba 00 00 00 00 0f 44 c2 39 c6 75 0f 8b 8b 18 02 00 00 b2 01 89 d8 e8 ee fd ff ff 8b 36 <8b> 06 0f 18 00 90 3b 75 ec 0f 85 a9 fe ff ff eb 11 85 ff 0f 84
Dec 29 22:33:51 linuxtest [1188725.040771] EIP: [<c060164a>] tcp_xmit_retransmit_queue+0x1b2/0x1dc SS:ESP 0068:c0805d0c
Dec 29 22:33:51 linuxtest [1188725.040929] CR2: 0000000000000000
Dec 29 22:33:51 linuxtest [1188725.041346] ---[ end trace 1b9e8ae01c5d5485 ]---
Dec 29 22:33:51 linuxtest [1188725.042940] Kernel panic - not syncing: Fatal exception in interrupt
Dec 29 22:33:51 linuxtest [1188725.043076] Pid: 0, comm: swapper Tainted: G D 2.6.31.6-v03 #2
Dec 29 22:33:51 linuxtest [1188725.043188] Call Trace:
Dec 29 22:33:51 linuxtest [1188725.043318] [<c066812b>] ? printk+0xf/0x11
Dec 29 22:33:51 linuxtest [1188725.043441] [<c066807f>] panic+0x39/0xd6
Dec 29 22:33:51 linuxtest [1188725.043558] [<c0205811>] oops_end+0x8b/0x9a
Dec 29 22:33:51 linuxtest [1188725.043683] [<c021c974>] no_context+0x13c/0x146
Dec 29 22:33:51 linuxtest [1188725.043814] [<c021ca91>] __bad_area_nosemaphore+0x113/0x11b
Dec 29 22:33:51 linuxtest [1188725.043943] [<c0553967>] ? nv_start_xmit_optimized+0x3d4/0x401
Dec 29 22:33:51 linuxtest [1188725.044073] [<c02253b2>] ? __enqueue_entity+0x8d/0x95
Dec 29 22:33:51 linuxtest [1188725.044182] [<c021caa6>] bad_area_nosemaphore+0xd/0x10
Dec 29 22:33:51 linuxtest [1188725.044319] [<c021cce3>] do_page_fault+0x108/0x265
Dec 29 22:33:51 linuxtest [1188725.044444] [<c0223993>] ? enqueue_task+0x72/0x7f
Dec 29 22:33:51 linuxtest [1188725.044562] [<c021cbdb>] ? do_page_fault+0x0/0x265
Dec 29 22:33:51 linuxtest [1188725.044686] [<c0669b86>] error_code+0x66/0x6c
Dec 29 22:33:51 linuxtest [1188725.044817] [<c021cbdb>] ? do_page_fault+0x0/0x265
Dec 29 22:33:51 linuxtest [1188725.044944] [<c060164a>] ? tcp_xmit_retransmit_queue+0x1b2/0x1dc
Dec 29 22:33:51 linuxtest [1188725.045077] [<c05fe931>] tcp_ack+0x1591/0x1778
Dec 29 22:33:51 linuxtest [1188725.045201] [<c061df41>] ? ipt_do_table+0x2f8/0x310
Dec 29 22:33:51 linuxtest [1188725.045332] [<c05ff493>] tcp_rcv_state_process+0x4db/0x7fc
Dec 29 22:33:51 linuxtest [1188725.045442] [<c0604e3d>] tcp_v4_do_rcv+0x263/0x29d
Dec 29 22:33:51 linuxtest [1188725.045567] [<c023381a>] ? local_bh_enable+0xb/0xd
Dec 29 22:33:51 linuxtest [1188725.045694] [<c05d4571>] ? sk_filter+0x5e/0x69
Dec 29 22:33:51 linuxtest [1188725.045802] [<c06059b4>] tcp_v4_rcv+0x371/0x502
Dec 29 22:33:51 linuxtest [1188725.045911] [<c05ee78c>] ? ip_local_deliver_finish+0x0/0x171
Dec 29 22:33:51 linuxtest [1188725.046045] [<c05ee88a>] ip_local_deliver_finish+0xfe/0x171
Dec 29 22:33:51 linuxtest [1188725.046155] [<c05ee95e>] ip_local_deliver+0x61/0x66
Dec 29 22:33:51 linuxtest [1188725.046301] [<c05ee531>] ip_rcv_finish+0x289/0x2b1
Dec 29 22:33:51 linuxtest [1188725.046429] [<c05ee75c>] ip_rcv+0x203/0x233
Dec 29 22:33:51 linuxtest [1188725.046555] [<c05ca149>] netif_receive_skb+0x335/0x350
Dec 29 22:33:51 linuxtest [1188725.046664] [<c05ca1c6>] process_backlog+0x62/0x88
Dec 29 22:33:51 linuxtest [1188725.046809] [<c05ca6c5>] net_rx_action+0x8e/0x16b
Dec 29 22:33:51 linuxtest [1188725.046917] [<c02335bb>] __do_softirq+0xa7/0x148
Dec 29 22:33:51 linuxtest [1188725.047041] [<c0233682>] do_softirq+0x26/0x2b
Dec 29 22:33:51 linuxtest [1188725.047162] [<c0233764>] irq_exit+0x29/0x5c
Dec 29 22:33:51 linuxtest [1188725.047285] [<c0204365>] do_IRQ+0x81/0x95
Dec 29 22:33:51 linuxtest [1188725.047409] [<c0202ec9>] common_interrupt+0x29/0x30
Dec 29 22:33:51 linuxtest [1188725.047536] [<c0208b74>] ? default_idle+0x3e/0x5b
Dec 29 22:33:51 linuxtest [1188725.047664] [<c02479c9>] ? clockevents_notify+0x60/0x65
Dec 29 22:33:51 linuxtest [1188725.047790] [<c0208c49>] c1e_idle+0xb8/0xd2
Dec 29 22:33:51 linuxtest [1188725.047913] [<c0201bba>] cpu_idle+0x45/0x5f
Dec 29 22:33:51 linuxtest [1188725.048030] [<c0643560>] rest_init+0x58/0x5a
Dec 29 22:33:51 linuxtest [1188725.048153] [<c084f7f9>] start_kernel+0x2f0/0x2f5
Dec 29 22:33:51 linuxtest [1188725.048271] [<c084f070>] i386_start_kernel+0x70/0x77
Dec 29 22:33:51 linuxtest [1188725.048404] Rebooting in 10 seconds..
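
For anyone trying to read the "Code:" bytes in the oops above, here is a minimal sketch of how to locate the faulting instruction. It is not part of the original report, and the helper name split_oops_code is hypothetical. The only inputs are the bytes printed in the log, where the byte in <..> marks the address EIP pointed at. On 32-bit x86 the bytes 8b 06 encode mov (%esi),%eax, which would be consistent with ESI: 00000000 and CR2: 0000000000000000 in the register dump, though the matching source line can only be confirmed against the actual kernel build.

```python
import re

# "Code:" bytes copied from the oops above; <8b> is the byte EIP pointed at.
CODE = (
    "ec bd 84 c0 ff 04 88 8b 55 ec 8b 02 39 d0 ba 00 00 00 00 0f 44 c2 "
    "39 c6 75 0f 8b 8b 18 02 00 00 b2 01 89 d8 e8 ee fd ff ff 8b 36 "
    "<8b> 06 0f 18 00 90 3b 75 ec 0f 85 a9 fe ff ff eb 11 85 ff 0f 84"
)


def split_oops_code(code):
    """Split an oops 'Code:' string into (bytes before EIP, bytes from EIP on)."""
    tokens = code.split()
    # Exactly one token should be wrapped in <..>; it marks the faulting byte.
    marked = [i for i, tok in enumerate(tokens)
              if re.fullmatch(r"<[0-9a-fA-F]{2}>", tok)]
    if len(marked) != 1:
        raise ValueError("expected exactly one <..> marker in the Code: line")
    idx = marked[0]
    clean = [tok.strip("<>") for tok in tokens]
    before = bytes.fromhex("".join(clean[:idx]))
    at_eip = bytes.fromhex("".join(clean[idx:]))
    return before, at_eip


before, at_eip = split_oops_code(CODE)
# Number of bytes preceding EIP, then the first two bytes of the faulting instruction.
print(len(before), " ".join(f"{b:02x}" for b in at_eip[:2]))  # prints "43 8b 06"
```

The two byte strings can then be fed to a disassembler to see the instructions around the fault; the kernel tree also ships scripts/decodecode for this purpose.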
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Panic at tcp_xmit_retransmit_queue
2010-01-19 16:13 Panic at tcp_xmit_retransmit_queue sbs
@ 2010-01-19 19:36 ` sbs
2010-02-01 14:45 ` sbs
1 sibling, 0 replies; 7+ messages in thread
From: sbs @ 2010-01-19 19:36 UTC (permalink / raw)
To: netdev, linux-kernel
It seems that I found the bug.
It was a problem with the nVidia card (forcedeth):
00:08.0 Bridge: nVidia Corporation MCP55 Ethernet (rev a3)
and dynamic netconsole compiled into the kernel:
CONFIG_NETCONSOLE=y
CONFIG_NETCONSOLE_DYNAMIC=y
I still need to verify this, though.
On Tue, Jan 19, 2010 at 7:13 PM, sbs <gexlie@gmail.com> wrote:
> We are hitting kernel panics on servers with nVidia MCP55 NICs about once a day.
> They usually appear under high network traffic (around 10000Mbit/s), but
> that is not a rule; it has happened even under low traffic.
>
> The servers run nginx serving static content.
> On 2 identical servers this panic happens approximately 2 times a day, depending on
> network load. The machine freezes completely until the netconsole reboots.
>
> Kernel: 2.6.32.3
>
> What can it be? What's wrong with the tcp_xmit_retransmit_queue() function?
> Can anyone explain or fix it?
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Panic at tcp_xmit_retransmit_queue
2010-01-19 16:13 Panic at tcp_xmit_retransmit_queue sbs
2010-01-19 19:36 ` sbs
@ 2010-02-01 14:45 ` sbs
2010-02-03 11:02 ` Ilpo Järvinen
1 sibling, 1 reply; 7+ messages in thread
From: sbs @ 2010-02-01 14:45 UTC (permalink / raw)
To: netdev, linux-kernel
Actually, removing netconsole from the kernel didn't help.
I found many people with the same problem, but with different hardware
configurations, here:
freez in TCP stack :
http://bugzilla.kernel.org/show_bug.cgi?id=14470
Is there someone who can investigate it?
On Tue, Jan 19, 2010 at 7:13 PM, sbs <gexlie@gmail.com> wrote:
> We are hitting kernel panics on servers with nVidia MCP55 NICs about once a day.
> They usually appear under high network traffic (around 10000Mbit/s), but
> that is not a rule; it has happened even under low traffic.
>
> The servers run nginx serving static content.
> On 2 identical servers this panic happens approximately 2 times a day, depending on
> network load. The machine freezes completely until the netconsole reboots.
>
> Kernel: 2.6.32.3
>
> What can it be? What's wrong with the tcp_xmit_retransmit_queue() function?
> Can anyone explain or fix it?
>
> Panic output:
>
> Dec 29 22:33:51 linuxtest [1188725.037019] BUG: unable to handle kernel
> Dec 29 22:33:51 linuxtest NULL pointer dereference
> Dec 29 22:33:51 linuxtest at (null)
> Dec 29 22:33:51 linuxtest [1188725.037042] IP:
> Dec 29 22:33:51 linuxtest [<c060164a>] tcp_xmit_retransmit_queue+0x1b2/0x1dc
> Dec 29 22:33:51 linuxtest [1188725.037064] *pdpt = 00000000229c2001
> Dec 29 22:33:51 linuxtest *pde = 0000000000000000
> Dec 29 22:33:51 linuxtest
> Dec 29 22:33:51 linuxtest [1188725.037080] Thread overran stack, or
> stack corrupted
> Dec 29 22:33:51 linuxtest [1188725.037091] Oops: 0000 [#1]
> Dec 29 22:33:51 linuxtest SMP
> Dec 29 22:33:51 linuxtest
> Dec 29 22:33:51 linuxtest [1188725.037104] last sysfs file:
> /sys/devices/pci0000:00/0000:00:0f.0/0000:07:00.0/0000:08:01.0/0000:09:00.0/class
> Dec 29 22:33:51 linuxtest [1188725.037124]
> Dec 29 22:33:51 linuxtest [1188725.037131] Pid: 0, comm: swapper Not
> tainted (2.6.31.6-v03 #2) H8DMU
> Dec 29 22:33:51 linuxtest [1188725.037145] EIP: 0060:[<c060164a>]
> EFLAGS: 00010246 CPU: 0
> Dec 29 22:33:51 linuxtest [1188725.037158] EIP is at
> tcp_xmit_retransmit_queue+0x1b2/0x1dc
> Dec 29 22:33:51 linuxtest [1188725.037170] EAX: c540513c EBX: c54050c0
> ECX: 0e377f15 EDX: c540513c
> Dec 29 22:33:51 linuxtest [1188725.037183] ESI: 00000000 EDI: 00000000
> EBP: c0805d28 ESP: c0805d0c
> Dec 29 22:33:51 linuxtest [1188725.037196] DS: 007b ES: 007b FS: 00d8
> GS: 0000 SS: 0068
> Dec 29 22:33:51 linuxtest [1188725.037208] Process swapper (pid: 0,
> ti=c0804000 task=c080b5a0 task.ti=c0804000)
> Dec 29 22:33:51 linuxtest [1188725.037285] Stack:
> Dec 29 22:33:51 linuxtest [1188725.037368] 00000202
> Dec 29 22:33:51 linuxtest 00000000
> Dec 29 22:33:51 linuxtest c540513c
> Dec 29 22:33:51 linuxtest 0e377f14
> Dec 29 22:33:51 linuxtest 00000000
> Dec 29 22:33:51 linuxtest c54050c0
> Dec 29 22:33:51 linuxtest 0000050e
> Dec 29 22:33:51 linuxtest c0805da8
> Dec 29 22:33:51 linuxtest
> Dec 29 22:33:51 linuxtest [1188725.037472] <0>
> Dec 29 22:33:51 linuxtest c05fe931
> Dec 29 22:33:51 linuxtest 00000001
> Dec 29 22:33:51 linuxtest 00000001
> Dec 29 22:33:51 linuxtest 00000006
> Dec 29 22:33:51 linuxtest 00000005
> Dec 29 22:33:51 linuxtest 00000001
> Dec 29 22:33:51 linuxtest 00000001
> Dec 29 22:33:51 linuxtest 00000006
> Dec 29 22:33:51 linuxtest
> Dec 29 22:33:51 linuxtest [1188725.037629] <0>
> Dec 29 22:33:51 linuxtest 01000246
> Dec 29 22:33:51 linuxtest 00000005
> Dec 29 22:33:51 linuxtest 11b57b53
> Dec 29 22:33:51 linuxtest c5405168
> Dec 29 22:33:51 linuxtest c061df41
> Dec 29 22:33:51 linuxtest 00000006
> Dec 29 22:33:51 linuxtest 00000000
> Dec 29 22:33:51 linuxtest 00000000
> Dec 29 22:33:51 linuxtest
> Dec 29 22:33:51 linuxtest [1188725.037887] Call Trace:
> Dec 29 22:33:51 linuxtest [1188725.037975] [<c05fe931>] ? tcp_ack+0x1591/0x1778
> Dec 29 22:33:51 linuxtest [1188725.038073] [<c061df41>] ?
> ipt_do_table+0x2f8/0x310
> Dec 29 22:33:51 linuxtest [1188725.038148] [<c05ff493>] ?
> tcp_rcv_state_process+0x4db/0x7fc
> Dec 29 22:33:51 linuxtest [1188725.038246] [<c0604e3d>] ?
> tcp_v4_do_rcv+0x263/0x29d
> Dec 29 22:33:51 linuxtest [1188725.038321] [<c023381a>] ?
> local_bh_enable+0xb/0xd
> Dec 29 22:33:51 linuxtest [1188725.038419] [<c05d4571>] ? sk_filter+0x5e/0x69
> Dec 29 22:33:51 linuxtest [1188725.038510] [<c06059b4>] ?
> tcp_v4_rcv+0x371/0x502
> Dec 29 22:33:51 linuxtest [1188725.038607] [<c05ee78c>] ?
> ip_local_deliver_finish+0x0/0x171
> Dec 29 22:33:51 linuxtest [1188725.038684] [<c05ee88a>] ?
> ip_local_deliver_finish+0xfe/0x171
> Dec 29 22:33:51 linuxtest [1188725.038784] [<c05ee95e>] ?
> ip_local_deliver+0x61/0x66
> Dec 29 22:33:51 linuxtest [1188725.038876] [<c05ee531>] ?
> ip_rcv_finish+0x289/0x2b1
> Dec 29 22:33:51 linuxtest [1188725.038961] [<c05ee75c>] ? ip_rcv+0x203/0x233
> Dec 29 22:33:51 linuxtest [1188725.039052] [<c05ca149>] ?
> netif_receive_skb+0x335/0x350
> Dec 29 22:33:51 linuxtest [1188725.039151] [<c05ca1c6>] ?
> process_backlog+0x62/0x88
> Dec 29 22:33:51 linuxtest [1188725.039242] [<c05ca6c5>] ?
> net_rx_action+0x8e/0x16b
> Dec 29 22:33:51 linuxtest [1188725.039333] [<c02335bb>] ?
> __do_softirq+0xa7/0x148
> Dec 29 22:33:51 linuxtest [1188725.039423] [<c0233682>] ? do_softirq+0x26/0x2b
> Dec 29 22:33:51 linuxtest [1188725.039520] [<c0233764>] ? irq_exit+0x29/0x5c
> Dec 29 22:33:51 linuxtest [1188725.039610] [<c0204365>] ? do_IRQ+0x81/0x95
> Dec 29 22:33:51 linuxtest [1188725.039706] [<c0202ec9>] ?
> common_interrupt+0x29/0x30
> Dec 29 22:33:51 linuxtest [1188725.039797] [<c0208b74>] ?
> default_idle+0x3e/0x5b
> Dec 29 22:33:51 linuxtest [1188725.039895] [<c02479c9>] ?
> clockevents_notify+0x60/0x65
> Dec 29 22:33:51 linuxtest [1188725.039986] [<c0208c49>] ? c1e_idle+0xb8/0xd2
> Dec 29 22:33:51 linuxtest [1188725.040058] [<c0201bba>] ? cpu_idle+0x45/0x5f
> Dec 29 22:33:51 linuxtest [1188725.040131] [<c0643560>] ? rest_init+0x58/0x5a
> Dec 29 22:33:51 linuxtest [1188725.040212] [<c084f7f9>] ?
> start_kernel+0x2f0/0x2f5
> Dec 29 22:33:51 linuxtest [1188725.040285] [<c084f070>] ?
> i386_start_kernel+0x70/0x77
> Dec 29 22:33:51 linuxtest [1188725.040381] Code: ec bd 84 c0 ff 04 88 8b
> 55 ec 8b 02 39 d0 ba 00 00 00 00 0f 44 c2 39 c6 75 0f 8b 8b 18 02 00 00
> b2 01 89 d8 e8 ee fd ff ff 8b 36 <8b> 06 0f 18 00 90 3b 75 ec 0f 85 a9
> fe ff ff eb 11 85 ff 0f 84
> Dec 29 22:33:51 linuxtest [1188725.040771] EIP: [<c060164a>]
> tcp_xmit_retransmit_queue+0x1b2/0x1dc SS:ESP 0068:c0805d0c
> Dec 29 22:33:51 linuxtest [1188725.040929] CR2: 0000000000000000
> Dec 29 22:33:51 linuxtest [1188725.041346] ---[ end trace 1b9e8ae01c5d5485 ]---
> Dec 29 22:33:51 linuxtest [1188725.042940] Kernel panic - not syncing:
> Fatal exception in interrupt
> Dec 29 22:33:51 linuxtest [1188725.043076] Pid: 0, comm: swapper
> Tainted: G D 2.6.31.6-v03 #2
> Dec 29 22:33:51 linuxtest [1188725.043188] Call Trace:
> Dec 29 22:33:51 linuxtest [1188725.043318] [<c066812b>] ? printk+0xf/0x11
> Dec 29 22:33:51 linuxtest [1188725.043441] [<c066807f>] panic+0x39/0xd6
> Dec 29 22:33:51 linuxtest [1188725.043558] [<c0205811>] oops_end+0x8b/0x9a
> Dec 29 22:33:51 linuxtest [1188725.043683] [<c021c974>] no_context+0x13c/0x146
> Dec 29 22:33:51 linuxtest [1188725.043814] [<c021ca91>]
> __bad_area_nosemaphore+0x113/0x11b
> Dec 29 22:33:51 linuxtest [1188725.043943] [<c0553967>] ?
> nv_start_xmit_optimized+0x3d4/0x401
> Dec 29 22:33:51 linuxtest [1188725.044073] [<c02253b2>] ?
> __enqueue_entity+0x8d/0x95
> Dec 29 22:33:51 linuxtest [1188725.044182] [<c021caa6>]
> bad_area_nosemaphore+0xd/0x10
> Dec 29 22:33:51 linuxtest [1188725.044319] [<c021cce3>]
> do_page_fault+0x108/0x265
> Dec 29 22:33:51 linuxtest [1188725.044444] [<c0223993>] ?
> enqueue_task+0x72/0x7f
> Dec 29 22:33:51 linuxtest [1188725.044562] [<c021cbdb>] ?
> do_page_fault+0x0/0x265
> Dec 29 22:33:51 linuxtest [1188725.044686] [<c0669b86>] error_code+0x66/0x6c
> Dec 29 22:33:51 linuxtest [1188725.044817] [<c021cbdb>] ?
> do_page_fault+0x0/0x265
> Dec 29 22:33:51 linuxtest [1188725.044944] [<c060164a>] ?
> tcp_xmit_retransmit_queue+0x1b2/0x1dc
> Dec 29 22:33:51 linuxtest [1188725.045077] [<c05fe931>] tcp_ack+0x1591/0x1778
> Dec 29 22:33:51 linuxtest [1188725.045201] [<c061df41>] ?
> ipt_do_table+0x2f8/0x310
> Dec 29 22:33:51 linuxtest [1188725.045332] [<c05ff493>]
> tcp_rcv_state_process+0x4db/0x7fc
> Dec 29 22:33:51 linuxtest [1188725.045442] [<c0604e3d>]
> tcp_v4_do_rcv+0x263/0x29d
> Dec 29 22:33:51 linuxtest [1188725.045567] [<c023381a>] ?
> local_bh_enable+0xb/0xd
> Dec 29 22:33:51 linuxtest [1188725.045694] [<c05d4571>] ? sk_filter+0x5e/0x69
> Dec 29 22:33:51 linuxtest [1188725.045802] [<c06059b4>] tcp_v4_rcv+0x371/0x502
> Dec 29 22:33:51 linuxtest [1188725.045911] [<c05ee78c>] ?
> ip_local_deliver_finish+0x0/0x171
> Dec 29 22:33:51 linuxtest [1188725.046045] [<c05ee88a>]
> ip_local_deliver_finish+0xfe/0x171
> Dec 29 22:33:51 linuxtest [1188725.046155] [<c05ee95e>]
> ip_local_deliver+0x61/0x66
> Dec 29 22:33:51 linuxtest [1188725.046301] [<c05ee531>]
> ip_rcv_finish+0x289/0x2b1
> Dec 29 22:33:51 linuxtest [1188725.046429] [<c05ee75c>] ip_rcv+0x203/0x233
> Dec 29 22:33:51 linuxtest [1188725.046555] [<c05ca149>]
> netif_receive_skb+0x335/0x350
> Dec 29 22:33:51 linuxtest [1188725.046664] [<c05ca1c6>]
> process_backlog+0x62/0x88
> Dec 29 22:33:51 linuxtest [1188725.046809] [<c05ca6c5>]
> net_rx_action+0x8e/0x16b
> Dec 29 22:33:51 linuxtest [1188725.046917] [<c02335bb>] __do_softirq+0xa7/0x148
> Dec 29 22:33:51 linuxtest [1188725.047041] [<c0233682>] do_softirq+0x26/0x2b
> Dec 29 22:33:51 linuxtest [1188725.047162] [<c0233764>] irq_exit+0x29/0x5c
> Dec 29 22:33:51 linuxtest [1188725.047285] [<c0204365>] do_IRQ+0x81/0x95
> Dec 29 22:33:51 linuxtest [1188725.047409] [<c0202ec9>]
> common_interrupt+0x29/0x30
> Dec 29 22:33:51 linuxtest [1188725.047536] [<c0208b74>] ?
> default_idle+0x3e/0x5b
> Dec 29 22:33:51 linuxtest [1188725.047664] [<c02479c9>] ?
> clockevents_notify+0x60/0x65
> Dec 29 22:33:51 linuxtest [1188725.047790] [<c0208c49>] c1e_idle+0xb8/0xd2
> Dec 29 22:33:51 linuxtest [1188725.047913] [<c0201bba>] cpu_idle+0x45/0x5f
> Dec 29 22:33:51 linuxtest [1188725.048030] [<c0643560>] rest_init+0x58/0x5a
> Dec 29 22:33:51 linuxtest [1188725.048153] [<c084f7f9>]
> start_kernel+0x2f0/0x2f5
> Dec 29 22:33:51 linuxtest [1188725.048271] [<c084f070>]
> i386_start_kernel+0x70/0x77
> Dec 29 22:33:51 linuxtest [1188725.048404] Rebooting in 10 seconds..
>
* Re: Panic at tcp_xmit_retransmit_queue
2010-02-01 14:45 ` sbs
@ 2010-02-03 11:02 ` Ilpo Järvinen
2010-02-15 13:21 ` Ilpo Järvinen
0 siblings, 1 reply; 7+ messages in thread
From: Ilpo Järvinen @ 2010-02-03 11:02 UTC (permalink / raw)
To: sbs; +Cc: Netdev, LKML
On Mon, 1 Feb 2010, sbs wrote:
> Actually, removing netconsole from the kernel didn't help.
> I found many people with the same problem, but with different hardware
> configurations, here:
>
> freez in TCP stack :
> http://bugzilla.kernel.org/show_bug.cgi?id=14470
>
> Is there someone who can investigate it?
>
>
> On Tue, Jan 19, 2010 at 7:13 PM, sbs <gexlie@gmail.com> wrote:
> > We are hitting kernel panics on servers with nVidia MCP55 NICs once a day;
> > they usually appear under high network traffic (around 10000Mbit/s), but
> > that is not a rule; it has happened even under low traffic.
> >
> > The servers are used for nginx + static content.
> > On 2 identical servers this panic happens approx. 2 times a day, depending on
> > network load. The machine freezes completely until the netconsole reboots.
> >
> > Kernel: 2.6.32.3
> >
> > What can it be? What's wrong with the tcp_xmit_retransmit_queue() function?
> > Can anyone explain or fix it?
You might want to try the debug patch below. It might even let the
box survive the event (if I got it coded right).
--
i.
--
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 383ce23..f4600fb 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2186,6 +2186,42 @@ static int tcp_can_forward_retransmit(struct sock *sk)
 	return 1;
 }
 
+static void print_queue(struct sock *sk, struct sk_buff *old, struct sk_buff *hole)
+{
+	struct tcp_sock *tp = tcp_sk(sk);
+	struct sk_buff *skb, *prev;
+
+	skb = tcp_write_queue_head(sk);
+	prev = (struct sk_buff *)(&sk->sk_write_queue);
+
+	if (skb == NULL) {
+		printk("NULL head, pkts %u\n", tp->packets_out);
+		return;
+	}
+	printk("head %p tail %p sendhead %p oldhint %p now %p hole %p high %u\n",
+	       tcp_write_queue_head(sk), tcp_write_queue_tail(sk),
+	       tcp_send_head(sk), old, tp->retransmit_skb_hint, hole,
+	       tp->retransmit_high);
+
+	while (skb) {
+		printk("skb %p (%u-%u) next %p prev %p sacked %u\n",
+		       skb, TCP_SKB_CB(skb)->seq, TCP_SKB_CB(skb)->end_seq,
+		       skb->next, skb->prev, TCP_SKB_CB(skb)->sacked);
+		if (prev != skb->prev)
+			printk("Inconsistent prev\n");
+
+		if (skb == tcp_write_queue_tail(sk)) {
+			if (skb->next != (struct sk_buff *)(&sk->sk_write_queue))
+				printk("Improper next at tail\n");
+			return;
+		}
+
+		prev = skb;
+		skb = skb->next;
+	}
+	printk("Encountered unexpected NULL\n");
+}
+
 /* This gets called after a retransmit timeout, and the initially
  * retransmitted data is acknowledged. It tries to continue
  * resending the rest of the retransmit queue, until either
@@ -2194,12 +2230,15 @@ static int tcp_can_forward_retransmit(struct sock *sk)
  * based retransmit packet might feed us FACK information again.
  * If so, we use it to avoid unnecessarily retransmissions.
  */
+static int caught_it = 0;
+
 void tcp_xmit_retransmit_queue(struct sock *sk)
 {
 	const struct inet_connection_sock *icsk = inet_csk(sk);
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct sk_buff *skb;
 	struct sk_buff *hole = NULL;
+	struct sk_buff *old = tp->retransmit_skb_hint;
 	u32 last_lost;
 	int mib_idx;
 	int fwd_rexmitting = 0;
@@ -2217,6 +2256,16 @@ void tcp_xmit_retransmit_queue(struct sock *sk)
 		last_lost = tp->snd_una;
 	}
 
+checknull:
+	if (skb == NULL) {
+		if (!caught_it)
+			print_queue(sk, old, hole);
+		caught_it++;
+		if (net_ratelimit())
+			printk("Errors caught so far %u\n", caught_it);
+		return;
+	}
+
 	tcp_for_write_queue_from(skb, sk) {
 		__u8 sacked = TCP_SKB_CB(skb)->sacked;
 
@@ -2257,7 +2306,7 @@ begin_fwd:
 		} else if (!(sacked & TCPCB_LOST)) {
 			if (hole == NULL && !(sacked & (TCPCB_SACKED_RETRANS|TCPCB_SACKED_ACKED)))
 				hole = skb;
-			continue;
+			goto checknull;
 
 		} else {
 			last_lost = TCP_SKB_CB(skb)->end_seq;
@@ -2268,7 +2317,7 @@ begin_fwd:
 		}
 
 		if (sacked & (TCPCB_SACKED_ACKED|TCPCB_SACKED_RETRANS))
-			continue;
+			goto checknull;
 
 		if (tcp_retransmit_skb(sk, skb))
 			return;
@@ -2278,6 +2327,7 @@ begin_fwd:
 			inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS,
 						  inet_csk(sk)->icsk_rto,
 						  TCP_RTO_MAX);
+		goto checknull;
 	}
 }
 
* Re: Panic at tcp_xmit_retransmit_queue
2010-02-03 11:02 ` Ilpo Järvinen
@ 2010-02-15 13:21 ` Ilpo Järvinen
2010-02-18 10:28 ` Bruno Prémont
2010-03-02 13:16 ` sbs
0 siblings, 2 replies; 7+ messages in thread
From: Ilpo Järvinen @ 2010-02-15 13:21 UTC (permalink / raw)
To: sbs; +Cc: Netdev, LKML
[-- Attachment #1: Type: TEXT/PLAIN, Size: 4306 bytes --]
On Wed, 3 Feb 2010, Ilpo Järvinen wrote:
> On Mon, 1 Feb 2010, sbs wrote:
>
> > Actually, removing netconsole from the kernel didn't help.
> > I found many people with the same problem, but with different hardware
> > configurations, here:
> >
> > freez in TCP stack :
> > http://bugzilla.kernel.org/show_bug.cgi?id=14470
> >
> > Is there someone who can investigate it?
> >
> >
> > On Tue, Jan 19, 2010 at 7:13 PM, sbs <gexlie@gmail.com> wrote:
> > > We are hitting kernel panics on servers with nVidia MCP55 NICs once a day;
> > > they usually appear under high network traffic (around 10000Mbit/s), but
> > > that is not a rule; it has happened even under low traffic.
> > >
> > > The servers are used for nginx + static content.
> > > On 2 identical servers this panic happens approx. 2 times a day, depending on
> > > network load. The machine freezes completely until the netconsole reboots.
> > >
> > > Kernel: 2.6.32.3
> > >
> > > What can it be? What's wrong with the tcp_xmit_retransmit_queue() function?
> > > Can anyone explain or fix it?
>
> You might want to try the debug patch below. It might even let the
> box survive the event (if I got it coded right).
Here is a better version of the debug patch; hopefully the infinite
looping is now gone.
--
i.
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 383ce23..4672a30 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2186,6 +2186,42 @@ static int tcp_can_forward_retransmit(struct sock *sk)
 	return 1;
 }
 
+static void print_queue(struct sock *sk, struct sk_buff *old, struct sk_buff *hole)
+{
+	struct tcp_sock *tp = tcp_sk(sk);
+	struct sk_buff *skb, *prev;
+
+	skb = tcp_write_queue_head(sk);
+	prev = (struct sk_buff *)(&sk->sk_write_queue);
+
+	if (skb == NULL) {
+		printk("NULL head, pkts %u\n", tp->packets_out);
+		return;
+	}
+	printk("head %p tail %p sendhead %p oldhint %p now %p hole %p high %u\n",
+	       tcp_write_queue_head(sk), tcp_write_queue_tail(sk),
+	       tcp_send_head(sk), old, tp->retransmit_skb_hint, hole,
+	       tp->retransmit_high);
+
+	while (skb) {
+		printk("skb %p (%u-%u) next %p prev %p sacked %u\n",
+		       skb, TCP_SKB_CB(skb)->seq, TCP_SKB_CB(skb)->end_seq,
+		       skb->next, skb->prev, TCP_SKB_CB(skb)->sacked);
+		if (prev != skb->prev)
+			printk("Inconsistent prev\n");
+
+		if (skb == tcp_write_queue_tail(sk)) {
+			if (skb->next != (struct sk_buff *)(&sk->sk_write_queue))
+				printk("Improper next at tail\n");
+			return;
+		}
+
+		prev = skb;
+		skb = skb->next;
+	}
+	printk("Encountered unexpected NULL\n");
+}
+
 /* This gets called after a retransmit timeout, and the initially
  * retransmitted data is acknowledged. It tries to continue
  * resending the rest of the retransmit queue, until either
@@ -2194,12 +2230,15 @@ static int tcp_can_forward_retransmit(struct sock *sk)
  * based retransmit packet might feed us FACK information again.
  * If so, we use it to avoid unnecessarily retransmissions.
  */
+static int caught_it = 0;
+
 void tcp_xmit_retransmit_queue(struct sock *sk)
 {
 	const struct inet_connection_sock *icsk = inet_csk(sk);
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct sk_buff *skb;
 	struct sk_buff *hole = NULL;
+	struct sk_buff *old = tp->retransmit_skb_hint;
 	u32 last_lost;
 	int mib_idx;
 	int fwd_rexmitting = 0;
@@ -2217,6 +2256,16 @@ void tcp_xmit_retransmit_queue(struct sock *sk)
 		last_lost = tp->snd_una;
 	}
 
+checknull:
+	if (skb == NULL) {
+		if (!caught_it)
+			print_queue(sk, old, hole);
+		caught_it++;
+		if (net_ratelimit())
+			printk("Errors caught so far %u\n", caught_it);
+		return;
+	}
+
 	tcp_for_write_queue_from(skb, sk) {
 		__u8 sacked = TCP_SKB_CB(skb)->sacked;
 
@@ -2257,7 +2306,7 @@ begin_fwd:
 		} else if (!(sacked & TCPCB_LOST)) {
 			if (hole == NULL && !(sacked & (TCPCB_SACKED_RETRANS|TCPCB_SACKED_ACKED)))
 				hole = skb;
-			continue;
+			goto cont;
 
 		} else {
 			last_lost = TCP_SKB_CB(skb)->end_seq;
@@ -2268,7 +2317,7 @@ begin_fwd:
 		}
 
 		if (sacked & (TCPCB_SACKED_ACKED|TCPCB_SACKED_RETRANS))
-			continue;
+			goto cont;
 
 		if (tcp_retransmit_skb(sk, skb))
 			return;
@@ -2278,6 +2327,9 @@ begin_fwd:
 			inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS,
 						  inet_csk(sk)->icsk_rto,
 						  TCP_RTO_MAX);
+cont:
+		skb = skb->next;
+		goto checknull;
 	}
 }
 
* Re: Panic at tcp_xmit_retransmit_queue
2010-02-15 13:21 ` Ilpo Järvinen
@ 2010-02-18 10:28 ` Bruno Prémont
2010-03-02 13:16 ` sbs
1 sibling, 0 replies; 7+ messages in thread
From: Bruno Prémont @ 2010-02-18 10:28 UTC (permalink / raw)
To: Ilpo Järvinen; +Cc: sbs, Netdev, LKML
On Mon, 15 Feb 2010 15:21:58 "Ilpo Järvinen" wrote:
> On Wed, 3 Feb 2010, Ilpo Järvinen wrote:
>
> > On Mon, 1 Feb 2010, sbs wrote:
> >
> > > Actually, removing netconsole from the kernel didn't help.
> > > I found many people with the same problem, but with different
> > > hardware configurations, here:
> > >
> > > freez in TCP stack :
> > > http://bugzilla.kernel.org/show_bug.cgi?id=14470
> > >
> > > Is there someone who can investigate it?
> > >
> > >
> > > On Tue, Jan 19, 2010 at 7:13 PM, sbs <gexlie@gmail.com> wrote:
> > > > We are hitting kernel panics on servers with nVidia MCP55 NICs
> > > > once a day; they usually appear under high network traffic
> > > > (around 10000Mbit/s), but that is not a rule; it has happened
> > > > even under low traffic.
> > > >
> > > > The servers are used for nginx + static content.
> > > > On 2 identical servers this panic happens approx. 2 times a day,
> > > > depending on network load. The machine freezes completely until
> > > > the netconsole reboots.
> > > >
> > > > Kernel: 2.6.32.3
> > > >
> > > > What can it be? What's wrong with the
> > > > tcp_xmit_retransmit_queue() function? Can anyone explain or fix it?
> >
> > You might want to try the debug patch below. It might even let
> > the box survive the event (if I got it coded right).
>
> Here is a better version of the debug patch; hopefully the
> infinite looping is now gone.
I can reproduce the freeze pretty easily, even on an idle server;
all I need is netconsole enabled, an ssh connection to the server and
permission to write to /proc/sysrq-trigger.
The following command, executed via SSH, freezes the system:
echo t > /proc/sysrq-trigger
when netconsole is enabled. Doing the same from the local console has no
negative effect (idle system).
Unfortunately I can't get any useful information out of the system, as
nothing reaches the VGA console and interaction with the system is no
longer possible (the cursor is still blinking on the VGA console).
I also currently have no setup here to analyze the dead system via a
kexec crash kernel that would be run by the watchdog.
The system I'm using is an HP ProLiant DL360 G5 (4 logical CPUs, two
sockets), bnx2 NIC.
Eventually I will try to reproduce it on some other system as
well (to rule out the NIC driver).
Any hints on how to get pertinent data out of that system would be
really nice!
Regards,
Bruno
* Re: Panic at tcp_xmit_retransmit_queue
2010-02-15 13:21 ` Ilpo Järvinen
2010-02-18 10:28 ` Bruno Prémont
@ 2010-03-02 13:16 ` sbs
1 sibling, 0 replies; 7+ messages in thread
From: sbs @ 2010-03-02 13:16 UTC (permalink / raw)
To: Ilpo Järvinen; +Cc: Netdev, LKML
Thank you very much. The server has been running stably for a week and
it seems to work like a charm now; I haven't detected any panics
since I applied the patch. It seems, though, that the problem has stopped
occurring, because I don't see any debug information through netconsole.