* Re: Fw: [Bug 14470] New: freez in TCP stack
From: Eric Dumazet @ 2009-10-29 5:59 UTC (permalink / raw)
To: David S. Miller
Cc: Andrew Morton, Stephen Hemminger, netdev, kolo, bugzilla-daemon
In-Reply-To: <4AE9298C.1000204@gmail.com>
Eric Dumazet wrote:
> Andrew Morton wrote:
>> On Mon, 26 Oct 2009 08:41:32 -0700
>> Stephen Hemminger <shemminger@linux-foundation.org> wrote:
>>
>>> Begin forwarded message:
>>>
>>> Date: Mon, 26 Oct 2009 12:47:22 GMT
>>> From: bugzilla-daemon@bugzilla.kernel.org
>>> To: shemminger@linux-foundation.org
>>> Subject: [Bug 14470] New: freez in TCP stack
>>>
>> Stephen, please retain the bugzilla and reporter email cc's when
>> forwarding a report to a mailing list.
>>
>>
>>> http://bugzilla.kernel.org/show_bug.cgi?id=14470
>>>
>>> Summary: freez in TCP stack
>>> Product: Networking
>>> Version: 2.5
>>> Kernel Version: 2.6.31
>>> Platform: All
>>> OS/Version: Linux
>>> Tree: Mainline
>>> Status: NEW
>>> Severity: high
>>> Priority: P1
>>> Component: IPV4
>>> AssignedTo: shemminger@linux-foundation.org
>>> ReportedBy: kolo@albatani.cz
>>> Regression: No
>>>
>>>
>>> We are hitting kernel panics on Dell R610 servers with e1000e NICs; it appears
>>> usually under high network traffic (around 100Mbit/s), but that is not a rule: it
>>> has happened even under low traffic.
>>>
>>> Servers are used as reverse http proxy (varnish).
>>>
>>> On 6 identical servers this panic happens approximately 2 times a day depending on network
>>> load. The machine freezes completely until the management watchdog reboots it.
>>>
>> Twice a day on six separate machines. That ain't no hardware glitch.
>>
>> Vaclav, are you able to say whether this is a regression? Did those
>> machines run 2.6.30 (for example)?
>>
>> Thanks.
>>
>>> We had to put a serial console on these servers to catch the oops. Is there
>>> anything else we can do to debug this?
>>> The RIP is always the same:
>>>
>>> RIP: 0010:[<ffffffff814203cc>] [<ffffffff814203cc>]
>>> tcp_xmit_retransmit_queue+0x8c/0x290
>>>
>>> The rest of the oops always differs a little ... here is an example:
>>>
>>> RIP: 0010:[<ffffffff814203cc>] [<ffffffff814203cc>]
>>> tcp_xmit_retransmit_queue+0x8c/0x290
>>> RSP: 0018:ffffc90000003a40 EFLAGS: 00010246
>>> RAX: ffff8807e7420678 RBX: ffff8807e74205c0 RCX: 0000000000000000
>>> RDX: 000000004598a105 RSI: 0000000000000000 RDI: ffff8807e74205c0
>>> RBP: ffffc90000003a80 R08: 0000000000000003 R09: 0000000000000000
>>> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
>>> R13: ffff8807e74205c0 R14: ffff8807e7420678 R15: 0000000000000000
>>> FS: 0000000000000000(0000) GS:ffffc90000000000(0000) knlGS:0000000000000000
>>> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
>>> CR2: 0000000000000000 CR3: 0000000001001000 CR4: 00000000000006f0
>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>> Process swapper (pid: 0, threadinfo ffffffff81608000, task ffffffff81631440)
>>> Stack:
>>> ffffc90000003a60 0000000000000000 4598a105e74205c0 000000004598a101
>>> <0> 000000000000050e ffff8807e74205c0 0000000000000003 0000000000000000
>>> <0> ffffc90000003b40 ffffffff8141ae4a ffff8807e7420678 0000000000000000
>>> Call Trace:
>>> <IRQ>
>>> [<ffffffff8141ae4a>] tcp_ack+0x170a/0x1dd0
>>> [<ffffffff8141c362>] tcp_rcv_state_process+0x122/0xab0
>>> [<ffffffff81422c6c>] tcp_v4_do_rcv+0xac/0x220
>>> [<ffffffff813fd02f>] ? nf_iterate+0x5f/0x90
>>> [<ffffffff81424b26>] tcp_v4_rcv+0x586/0x6b0
>>> [<ffffffff813fd0c5>] ? nf_hook_slow+0x65/0xf0
>>> [<ffffffff81406b70>] ? ip_local_deliver_finish+0x0/0x120
>>> [<ffffffff81406bcf>] ip_local_deliver_finish+0x5f/0x120
>>> [<ffffffff8140715b>] ip_local_deliver+0x3b/0x90
>>> [<ffffffff81406971>] ip_rcv_finish+0x141/0x340
>>> [<ffffffff8140701f>] ip_rcv+0x24f/0x350
>>> [<ffffffff813e7ced>] netif_receive_skb+0x20d/0x2f0
>>> [<ffffffff813e7e90>] napi_skb_finish+0x40/0x50
>>> [<ffffffff813e82f4>] napi_gro_receive+0x34/0x40
>>> [<ffffffff8133e0c8>] e1000_receive_skb+0x48/0x60
>>> [<ffffffff81342342>] e1000_clean_rx_irq+0xf2/0x330
>>> [<ffffffff813410a1>] e1000_clean+0x81/0x2a0
>>> [<ffffffff81054ce1>] ? ktime_get+0x11/0x50
>>> [<ffffffff813eaf1c>] net_rx_action+0x9c/0x130
>>> [<ffffffff81046940>] ? get_next_timer_interrupt+0x1d0/0x210
>>> [<ffffffff81041bd7>] __do_softirq+0xb7/0x160
>>> [<ffffffff8100c27c>] call_softirq+0x1c/0x30
>>> [<ffffffff8100e04d>] do_softirq+0x3d/0x80
>>> [<ffffffff81041b0b>] irq_exit+0x7b/0x90
>>> [<ffffffff8100d613>] do_IRQ+0x73/0xe0
>>> [<ffffffff8100bb13>] ret_from_intr+0x0/0xa
>>> <EOI>
>>> [<ffffffff81296e6c>] ? acpi_idle_enter_bm+0x245/0x271
>>> [<ffffffff81296e62>] ? acpi_idle_enter_bm+0x23b/0x271
>>> [<ffffffff813c7a08>] ? cpuidle_idle_call+0x98/0xf0
>>> [<ffffffff8100a104>] ? cpu_idle+0x94/0xd0
>>> [<ffffffff81468db6>] ? rest_init+0x66/0x70
>>> [<ffffffff816a082f>] ? start_kernel+0x2ef/0x340
>>> [<ffffffff8169fd54>] ? x86_64_start_reservations+0x84/0x90
>>> [<ffffffff8169fe32>] ? x86_64_start_kernel+0xd2/0x100
>>> Code: 00 eb 28 8b 83 d0 03 00 00 41 39 44 24 40 0f 89 00 01 00 00 41 0f b6 cd
>>> 41 bd 2f 00 00 00 83 e1 03 0f 84 fc 00 00 00 4d 8b 24 24 <49> 8b 04 24 4d 39 f4
>>> 0f 18 08 0f 84 d9 00 00 00 4c 3b a3 b8 01
>>> RIP [<ffffffff814203cc>] tcp_xmit_retransmit_queue+0x8c/0x290
>>> RSP <ffffc90000003a40>
>>> CR2: 0000000000000000
>>> ---[ end trace d97d99c9ae1d52cc ]---
>>> Kernel panic - not syncing: Fatal exception in interrupt
>>> Pid: 0, comm: swapper Tainted: G D 2.6.31 #2
>>> Call Trace:
>>> <IRQ> [<ffffffff8103cab0>] panic+0xa0/0x170
>>> [<ffffffff8100bb13>] ? ret_from_intr+0x0/0xa
>>> [<ffffffff8103c74e>] ? print_oops_end_marker+0x1e/0x20
>>> [<ffffffff8100f38e>] oops_end+0x9e/0xb0
>>> [<ffffffff81025b9a>] no_context+0x15a/0x250
>>> [<ffffffff81025e2b>] __bad_area_nosemaphore+0xdb/0x1c0
>>> [<ffffffff813e89e9>] ? dev_hard_start_xmit+0x269/0x2f0
>>> [<ffffffff81025fae>] bad_area_nosemaphore+0xe/0x10
>>> [<ffffffff8102639f>] do_page_fault+0x17f/0x260
>>> [<ffffffff8147eadf>] page_fault+0x1f/0x30
>>> [<ffffffff814203cc>] ? tcp_xmit_retransmit_queue+0x8c/0x290
>>> [<ffffffff8141ae4a>] tcp_ack+0x170a/0x1dd0
>>> [<ffffffff8141c362>] tcp_rcv_state_process+0x122/0xab0
>>> [<ffffffff81422c6c>] tcp_v4_do_rcv+0xac/0x220
>>> [<ffffffff813fd02f>] ? nf_iterate+0x5f/0x90
>>> [<ffffffff81424b26>] tcp_v4_rcv+0x586/0x6b0
>>> [<ffffffff813fd0c5>] ? nf_hook_slow+0x65/0xf0
>>> [<ffffffff81406b70>] ? ip_local_deliver_finish+0x0/0x120
>>> [<ffffffff81406bcf>] ip_local_deliver_finish+0x5f/0x120
>>> [<ffffffff8140715b>] ip_local_deliver+0x3b/0x90
>>> [<ffffffff81406971>] ip_rcv_finish+0x141/0x340
>>> [<ffffffff8140701f>] ip_rcv+0x24f/0x350
>>> [<ffffffff813e7ced>] netif_receive_skb+0x20d/0x2f0
>>> [<ffffffff813e7e90>] napi_skb_finish+0x40/0x50
>>> [<ffffffff813e82f4>] napi_gro_receive+0x34/0x40
>>> [<ffffffff8133e0c8>] e1000_receive_skb+0x48/0x60
>>> [<ffffffff81342342>] e1000_clean_rx_irq+0xf2/0x330
>>> [<ffffffff813410a1>] e1000_clean+0x81/0x2a0
>>> [<ffffffff81054ce1>] ? ktime_get+0x11/0x50
>>> [<ffffffff813eaf1c>] net_rx_action+0x9c/0x130
>>> [<ffffffff81046940>] ? get_next_timer_interrupt+0x1d0/0x210
>>> [<ffffffff81041bd7>] __do_softirq+0xb7/0x160
>>> [<ffffffff8100c27c>] call_softirq+0x1c/0x30
>>> [<ffffffff8100e04d>] do_softirq+0x3d/0x80
>>> [<ffffffff81041b0b>] irq_exit+0x7b/0x90
>>> [<ffffffff8100d613>] do_IRQ+0x73/0xe0
>>> [<ffffffff8100bb13>] ret_from_intr+0x0/0xa
>>> <EOI> [<ffffffff81296e6c>] ? acpi_idle_enter_bm+0x245/0x271
>>> [<ffffffff81296e62>] ? acpi_idle_enter_bm+0x23b/0x271
>>> [<ffffffff813c7a08>] ? cpuidle_idle_call+0x98/0xf0
>>> [<ffffffff8100a104>] ? cpu_idle+0x94/0xd0
>>> [<ffffffff81468db6>] ? rest_init+0x66/0x70
>>> [<ffffffff816a082f>] ? start_kernel+0x2ef/0x340
>>> [<ffffffff8169fd54>] ? x86_64_start_reservations+0x84/0x90
>>> [<ffffffff8169fe32>] ? x86_64_start_kernel+0xd2/0x100
>>>
>
>
> Code: 00 eb 28 8b 83 d0 03 00 00
> 41 39 44 24 40 cmp %eax,0x40(%r12)
> 0f 89 00 01 00 00 jns ...
> 41 0f b6 cd movzbl %r13b,%ecx
> 41 bd 2f 00 00 00 mov $0x2f,%r13d
> 83 e1 03 and $0x3,%ecx
> 0f 84 fc 00 00 00 je ...
> 4d 8b 24 24 mov (%r12),%r12 skb = skb->next
> <>49 8b 04 24 mov (%r12),%rax << NULL POINTER dereference >>
> 4d 39 f4 cmp %r14,%r12
> 0f 18 08 prefetcht0 (%rax)
> 0f 84 d9 00 00 00 je ...
> 4c 3b a3 b8 01 cmp
>
>
> crash is in
> void tcp_xmit_retransmit_queue(struct sock *sk)
> {
>
> << HERE >> tcp_for_write_queue_from(skb, sk) {
>
> }
>
>
> Some skb in sk_write_queue has a NULL ->next pointer.
>
> Strange thing is that R14 and RAX both equal ffff8807e7420678 (&sk->sk_write_queue).
> R14 is the stable value during the loop, while RAX is a scratch register.
>
> I don't have the full disassembly for this function, but I guess we had just entered the loop
> (otherwise RAX would be really different at this point).
>
> So maybe the list head itself is corrupted (sk->sk_write_queue.next = NULL),
>
> or is it a retransmit_skb_hint problem? (Do we forget to set it to NULL in some cases?)
>
David, what do you think of the following patch?
I wonder if we should reorganize the code to add sanity checks in tcp_unlink_write_queue()
verifying that the skb we delete from the queue is not still referenced.
[PATCH] tcp: clear retrans hints in tcp_send_synack()
There is a small possibility that the skb we unlink from the write queue
is still referenced by the retrans hints.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index fcd278a..b22a72d 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2201,6 +2201,7 @@ int tcp_send_synack(struct sock *sk)
struct sk_buff *nskb = skb_copy(skb, GFP_ATOMIC);
if (nskb == NULL)
return -ENOMEM;
+ tcp_clear_all_retrans_hints(tcp_sk(sk));
tcp_unlink_write_queue(skb, sk);
skb_header_release(nskb);
__tcp_add_write_queue_head(sk, nskb);
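
The sanity check suggested above (making sure an skb removed from the queue is not still referenced by a hint) could look roughly like the following. This is a minimal user-space sketch, not kernel code: struct node, struct wq, the single hint field and all wq_* helpers are hypothetical stand-ins for the write queue and the retrans hints. In the kernel the check might rather be a WARN_ON(); here the helper simply clears the hint and reports that it had to.

```c
#include <stddef.h>

/* Hypothetical model of a queue entry (stands in for an skb). */
struct node {
	struct node *next, *prev;
};

/* Hypothetical model of a write queue with one cached "hint" pointer. */
struct wq {
	struct node head;	/* circular list head */
	struct node *hint;	/* cached position, NULL when unset */
};

static void wq_init(struct wq *q)
{
	q->head.next = q->head.prev = &q->head;
	q->hint = NULL;
}

static void wq_add_tail(struct wq *q, struct node *n)
{
	n->prev = q->head.prev;
	n->next = &q->head;
	q->head.prev->next = n;
	q->head.prev = n;
}

/*
 * Unlink 'n'. If the hint still points at it, clear the hint first.
 * Returns 1 when a stale reference was found and cleared, 0 otherwise.
 */
static int wq_unlink(struct wq *q, struct node *n)
{
	int was_hinted = (q->hint == n);

	if (was_hinted)
		q->hint = NULL;
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = n->prev = NULL;
	return was_hinted;
}

/* Returns 1 if unlinking a hinted entry leaves a consistent queue. */
static int demo_unlink_hinted(void)
{
	struct wq q;
	struct node a, b;

	wq_init(&q);
	wq_add_tail(&q, &a);
	wq_add_tail(&q, &b);
	q.hint = &b;
	return wq_unlink(&q, &b) == 1 && q.hint == NULL &&
	       q.head.next == &a && a.next == &q.head;
}
```

Putting the check inside the unlink helper means every caller gets it, instead of each call site having to remember to clear the hints.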
^ permalink raw reply related
* Re: [PATCH] vmxnet3: remove duplicate #include
From: David Miller @ 2009-10-29 5:52 UTC (permalink / raw)
To: sbhatewara; +Cc: netdev, weiyi.huang, pv-drivers
In-Reply-To: <20091028.222901.187567993.davem@davemloft.net>
From: David Miller <davem@davemloft.net>
Date: Wed, 28 Oct 2009 22:29:01 -0700 (PDT)
> From: Shreyas Bhatewara <sbhatewara@vmware.com>
> Date: Wed, 28 Oct 2009 09:30:40 -0700 (PDT)
>
>>
>> Remove duplicate headerfile includes from vmxnet3_int.h
>>
>> Signed-off-by: Shreyas Bhatewara <sbhatewara@vmware.com>
>> Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
>> Signed-off-by: Bhavesh Davda <davda@vmware.com>
>
> Applied.
Guys, I'd like to remove the X86 Kconfig requirement for this
driver. There really isn't any x86 specific code in the
driver, it uses only standard PCI and networking APIs to function.
I know the virtual hardware won't be seen on other platforms,
but allowing the driver to get build tested on non-x86 platforms
helps me a lot. I do all of my build verifications on sparc64
for example.
^ permalink raw reply
* Re: Fw: [Bug 14470] New: freez in TCP stack
From: Eric Dumazet @ 2009-10-29 5:35 UTC (permalink / raw)
To: Andrew Morton; +Cc: Stephen Hemminger, netdev, kolo, bugzilla-daemon
In-Reply-To: <20091028151313.ba4a4d23.akpm@linux-foundation.org>
Andrew Morton wrote:
> On Mon, 26 Oct 2009 08:41:32 -0700
> Stephen Hemminger <shemminger@linux-foundation.org> wrote:
>
>>
>> Begin forwarded message:
>>
>> Date: Mon, 26 Oct 2009 12:47:22 GMT
>> From: bugzilla-daemon@bugzilla.kernel.org
>> To: shemminger@linux-foundation.org
>> Subject: [Bug 14470] New: freez in TCP stack
>>
>
> Stephen, please retain the bugzilla and reporter email cc's when
> forwarding a report to a mailing list.
>
>
>> http://bugzilla.kernel.org/show_bug.cgi?id=14470
>>
>> Summary: freez in TCP stack
>> Product: Networking
>> Version: 2.5
>> Kernel Version: 2.6.31
>> Platform: All
>> OS/Version: Linux
>> Tree: Mainline
>> Status: NEW
>> Severity: high
>> Priority: P1
>> Component: IPV4
>> AssignedTo: shemminger@linux-foundation.org
>> ReportedBy: kolo@albatani.cz
>> Regression: No
>>
>>
>> We are hitting kernel panics on Dell R610 servers with e1000e NICs; it appears
>> usually under high network traffic (around 100Mbit/s), but that is not a rule: it
>> has happened even under low traffic.
>>
>> Servers are used as reverse http proxy (varnish).
>>
>> On 6 identical servers this panic happens approximately 2 times a day depending on network
>> load. The machine freezes completely until the management watchdog reboots it.
>>
>
> Twice a day on six separate machines. That ain't no hardware glitch.
>
> Vaclav, are you able to say whether this is a regression? Did those
> machines run 2.6.30 (for example)?
>
> Thanks.
>
>> We had to put a serial console on these servers to catch the oops. Is there
>> anything else we can do to debug this?
>> The RIP is always the same:
>>
>> RIP: 0010:[<ffffffff814203cc>] [<ffffffff814203cc>]
>> tcp_xmit_retransmit_queue+0x8c/0x290
>>
>> The rest of the oops always differs a little ... here is an example:
>>
>> RIP: 0010:[<ffffffff814203cc>] [<ffffffff814203cc>]
>> tcp_xmit_retransmit_queue+0x8c/0x290
>> RSP: 0018:ffffc90000003a40 EFLAGS: 00010246
>> RAX: ffff8807e7420678 RBX: ffff8807e74205c0 RCX: 0000000000000000
>> RDX: 000000004598a105 RSI: 0000000000000000 RDI: ffff8807e74205c0
>> RBP: ffffc90000003a80 R08: 0000000000000003 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
>> R13: ffff8807e74205c0 R14: ffff8807e7420678 R15: 0000000000000000
>> FS: 0000000000000000(0000) GS:ffffc90000000000(0000) knlGS:0000000000000000
>> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
>> CR2: 0000000000000000 CR3: 0000000001001000 CR4: 00000000000006f0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Process swapper (pid: 0, threadinfo ffffffff81608000, task ffffffff81631440)
>> Stack:
>> ffffc90000003a60 0000000000000000 4598a105e74205c0 000000004598a101
>> <0> 000000000000050e ffff8807e74205c0 0000000000000003 0000000000000000
>> <0> ffffc90000003b40 ffffffff8141ae4a ffff8807e7420678 0000000000000000
>> Call Trace:
>> <IRQ>
>> [<ffffffff8141ae4a>] tcp_ack+0x170a/0x1dd0
>> [<ffffffff8141c362>] tcp_rcv_state_process+0x122/0xab0
>> [<ffffffff81422c6c>] tcp_v4_do_rcv+0xac/0x220
>> [<ffffffff813fd02f>] ? nf_iterate+0x5f/0x90
>> [<ffffffff81424b26>] tcp_v4_rcv+0x586/0x6b0
>> [<ffffffff813fd0c5>] ? nf_hook_slow+0x65/0xf0
>> [<ffffffff81406b70>] ? ip_local_deliver_finish+0x0/0x120
>> [<ffffffff81406bcf>] ip_local_deliver_finish+0x5f/0x120
>> [<ffffffff8140715b>] ip_local_deliver+0x3b/0x90
>> [<ffffffff81406971>] ip_rcv_finish+0x141/0x340
>> [<ffffffff8140701f>] ip_rcv+0x24f/0x350
>> [<ffffffff813e7ced>] netif_receive_skb+0x20d/0x2f0
>> [<ffffffff813e7e90>] napi_skb_finish+0x40/0x50
>> [<ffffffff813e82f4>] napi_gro_receive+0x34/0x40
>> [<ffffffff8133e0c8>] e1000_receive_skb+0x48/0x60
>> [<ffffffff81342342>] e1000_clean_rx_irq+0xf2/0x330
>> [<ffffffff813410a1>] e1000_clean+0x81/0x2a0
>> [<ffffffff81054ce1>] ? ktime_get+0x11/0x50
>> [<ffffffff813eaf1c>] net_rx_action+0x9c/0x130
>> [<ffffffff81046940>] ? get_next_timer_interrupt+0x1d0/0x210
>> [<ffffffff81041bd7>] __do_softirq+0xb7/0x160
>> [<ffffffff8100c27c>] call_softirq+0x1c/0x30
>> [<ffffffff8100e04d>] do_softirq+0x3d/0x80
>> [<ffffffff81041b0b>] irq_exit+0x7b/0x90
>> [<ffffffff8100d613>] do_IRQ+0x73/0xe0
>> [<ffffffff8100bb13>] ret_from_intr+0x0/0xa
>> <EOI>
>> [<ffffffff81296e6c>] ? acpi_idle_enter_bm+0x245/0x271
>> [<ffffffff81296e62>] ? acpi_idle_enter_bm+0x23b/0x271
>> [<ffffffff813c7a08>] ? cpuidle_idle_call+0x98/0xf0
>> [<ffffffff8100a104>] ? cpu_idle+0x94/0xd0
>> [<ffffffff81468db6>] ? rest_init+0x66/0x70
>> [<ffffffff816a082f>] ? start_kernel+0x2ef/0x340
>> [<ffffffff8169fd54>] ? x86_64_start_reservations+0x84/0x90
>> [<ffffffff8169fe32>] ? x86_64_start_kernel+0xd2/0x100
>> Code: 00 eb 28 8b 83 d0 03 00 00 41 39 44 24 40 0f 89 00 01 00 00 41 0f b6 cd
>> 41 bd 2f 00 00 00 83 e1 03 0f 84 fc 00 00 00 4d 8b 24 24 <49> 8b 04 24 4d 39 f4
>> 0f 18 08 0f 84 d9 00 00 00 4c 3b a3 b8 01
>> RIP [<ffffffff814203cc>] tcp_xmit_retransmit_queue+0x8c/0x290
>> RSP <ffffc90000003a40>
>> CR2: 0000000000000000
>> ---[ end trace d97d99c9ae1d52cc ]---
>> Kernel panic - not syncing: Fatal exception in interrupt
>> Pid: 0, comm: swapper Tainted: G D 2.6.31 #2
>> Call Trace:
>> <IRQ> [<ffffffff8103cab0>] panic+0xa0/0x170
>> [<ffffffff8100bb13>] ? ret_from_intr+0x0/0xa
>> [<ffffffff8103c74e>] ? print_oops_end_marker+0x1e/0x20
>> [<ffffffff8100f38e>] oops_end+0x9e/0xb0
>> [<ffffffff81025b9a>] no_context+0x15a/0x250
>> [<ffffffff81025e2b>] __bad_area_nosemaphore+0xdb/0x1c0
>> [<ffffffff813e89e9>] ? dev_hard_start_xmit+0x269/0x2f0
>> [<ffffffff81025fae>] bad_area_nosemaphore+0xe/0x10
>> [<ffffffff8102639f>] do_page_fault+0x17f/0x260
>> [<ffffffff8147eadf>] page_fault+0x1f/0x30
>> [<ffffffff814203cc>] ? tcp_xmit_retransmit_queue+0x8c/0x290
>> [<ffffffff8141ae4a>] tcp_ack+0x170a/0x1dd0
>> [<ffffffff8141c362>] tcp_rcv_state_process+0x122/0xab0
>> [<ffffffff81422c6c>] tcp_v4_do_rcv+0xac/0x220
>> [<ffffffff813fd02f>] ? nf_iterate+0x5f/0x90
>> [<ffffffff81424b26>] tcp_v4_rcv+0x586/0x6b0
>> [<ffffffff813fd0c5>] ? nf_hook_slow+0x65/0xf0
>> [<ffffffff81406b70>] ? ip_local_deliver_finish+0x0/0x120
>> [<ffffffff81406bcf>] ip_local_deliver_finish+0x5f/0x120
>> [<ffffffff8140715b>] ip_local_deliver+0x3b/0x90
>> [<ffffffff81406971>] ip_rcv_finish+0x141/0x340
>> [<ffffffff8140701f>] ip_rcv+0x24f/0x350
>> [<ffffffff813e7ced>] netif_receive_skb+0x20d/0x2f0
>> [<ffffffff813e7e90>] napi_skb_finish+0x40/0x50
>> [<ffffffff813e82f4>] napi_gro_receive+0x34/0x40
>> [<ffffffff8133e0c8>] e1000_receive_skb+0x48/0x60
>> [<ffffffff81342342>] e1000_clean_rx_irq+0xf2/0x330
>> [<ffffffff813410a1>] e1000_clean+0x81/0x2a0
>> [<ffffffff81054ce1>] ? ktime_get+0x11/0x50
>> [<ffffffff813eaf1c>] net_rx_action+0x9c/0x130
>> [<ffffffff81046940>] ? get_next_timer_interrupt+0x1d0/0x210
>> [<ffffffff81041bd7>] __do_softirq+0xb7/0x160
>> [<ffffffff8100c27c>] call_softirq+0x1c/0x30
>> [<ffffffff8100e04d>] do_softirq+0x3d/0x80
>> [<ffffffff81041b0b>] irq_exit+0x7b/0x90
>> [<ffffffff8100d613>] do_IRQ+0x73/0xe0
>> [<ffffffff8100bb13>] ret_from_intr+0x0/0xa
>> <EOI> [<ffffffff81296e6c>] ? acpi_idle_enter_bm+0x245/0x271
>> [<ffffffff81296e62>] ? acpi_idle_enter_bm+0x23b/0x271
>> [<ffffffff813c7a08>] ? cpuidle_idle_call+0x98/0xf0
>> [<ffffffff8100a104>] ? cpu_idle+0x94/0xd0
>> [<ffffffff81468db6>] ? rest_init+0x66/0x70
>> [<ffffffff816a082f>] ? start_kernel+0x2ef/0x340
>> [<ffffffff8169fd54>] ? x86_64_start_reservations+0x84/0x90
>> [<ffffffff8169fe32>] ? x86_64_start_kernel+0xd2/0x100
>>
Code: 00 eb 28 8b 83 d0 03 00 00
41 39 44 24 40 cmp %eax,0x40(%r12)
0f 89 00 01 00 00 jns ...
41 0f b6 cd movzbl %r13b,%ecx
41 bd 2f 00 00 00 mov $0x2f,%r13d
83 e1 03 and $0x3,%ecx
0f 84 fc 00 00 00 je ...
4d 8b 24 24 mov (%r12),%r12 skb = skb->next
<>49 8b 04 24 mov (%r12),%rax << NULL POINTER dereference >>
4d 39 f4 cmp %r14,%r12
0f 18 08 prefetcht0 (%rax)
0f 84 d9 00 00 00 je ...
4c 3b a3 b8 01 cmp
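
As a side check on the `41 bd 2f 00 00 00` line in the listing above: the four bytes after the opcode are a 32-bit immediate stored in little-endian order, so `2f 00 00 00` decodes to 0x2f. A minimal C sketch of that decoding follows; the helper names are made up for illustration.

```c
#include <stdint.h>

/* Decode a 32-bit little-endian immediate from an instruction byte stream. */
static uint32_t imm32_le(const uint8_t *p)
{
	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[3] << 24);
}

/* Bytes from the oops: REX.B prefix (41), opcode b8+r (bd), then imm32. */
static uint32_t demo_imm(void)
{
	static const uint8_t insn[] = { 0x41, 0xbd, 0x2f, 0x00, 0x00, 0x00 };

	return imm32_le(insn + 2);	/* skip prefix and opcode */
}
```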
crash is in
void tcp_xmit_retransmit_queue(struct sock *sk)
{
<< HERE >> tcp_for_write_queue_from(skb, sk) {
}
Some skb in sk_write_queue has a NULL ->next pointer.
Strange thing is that R14 and RAX both equal ffff8807e7420678 (&sk->sk_write_queue).
R14 is the stable value during the loop, while RAX is a scratch register.
I don't have the full disassembly for this function, but I guess we had just entered the loop
(otherwise RAX would be really different at this point).
So maybe the list head itself is corrupted (sk->sk_write_queue.next = NULL),
or is it a retransmit_skb_hint problem? (Do we forget to set it to NULL in some cases?)
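
To illustrate the second hypothesis, here is a minimal user-space sketch, not kernel code, of how a stale hint can make a circular list walk hit a NULL ->next. All names (struct node, queue_*, walk_from) are hypothetical, and the model assumes that unlinking an entry sets its next/prev pointers to NULL. The walker returns -1 where a real loop without the NULL test would fault.

```c
#include <stddef.h>

/* Hypothetical model of a queue entry on a circular doubly linked list. */
struct node {
	struct node *next, *prev;
};

static void queue_init(struct node *head)
{
	head->next = head->prev = head;
}

static void queue_add_tail(struct node *head, struct node *n)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink 'n' and reset its links to NULL (assumption of this model). */
static void queue_unlink(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = n->prev = NULL;
}

/*
 * Walk from 'start' until the list head is reached again.
 * Returns the number of entries visited, or -1 if a NULL link is met.
 */
static int walk_from(const struct node *head, const struct node *start)
{
	const struct node *n;
	int visited = 0;

	for (n = start; n != head; n = n->next) {
		if (n == NULL)
			return -1;
		visited++;
	}
	return visited;
}

/* Build head -> a -> b -> c, remember b as a hint, then unlink b. */
static int demo_walk(int use_stale_hint)
{
	struct node head, a, b, c;
	struct node *hint;

	queue_init(&head);
	queue_add_tail(&head, &a);
	queue_add_tail(&head, &b);
	queue_add_tail(&head, &c);
	hint = &b;
	queue_unlink(&b);	/* the hint is not cleared */
	return walk_from(&head, use_stale_hint ? hint : head.next);
}
```

Starting from the list head visits the two remaining entries, while starting from the stale hint never gets back to the head because the unlinked entry no longer points into the list.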
^ permalink raw reply
* Re: [PATCH] Multicast packet reassembly can fail
From: Eric Dumazet @ 2009-10-29 5:31 UTC (permalink / raw)
To: David Miller; +Cc: schen, netdev
In-Reply-To: <20091028.215738.66603083.davem@davemloft.net>
David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Wed, 28 Oct 2009 11:18:24 +0100
>
>> Check line 219 of net/ipv4/inet_fragment.c
>>
>> #ifdef CONFIG_SMP
>> /* With SMP race we have to recheck hash table, because
>> * such entry could be created on other cpu, while we
>> * promoted read lock to write lock.
>> */
>> hlist_for_each_entry(qp, n, &f->hash[hash], list) {
>> if (qp->net == nf && f->match(qp, arg)) {
>> atomic_inc(&qp->refcnt);
>> write_unlock(&f->lock);
>> qp_in->last_in |= INET_FRAG_COMPLETE; <<< HERE >>>
>> inet_frag_put(qp_in, f);
>> return qp;
>> }
>> }
>> #endif
>>
>> I really wonder why we set INET_FRAG_COMPLETE here
>
> What has happened here is that another cpu created an identical
> frag entry before we took the write lock.
>
> So we're letting that other cpu's entry stand, and will release
> our local one and not use it at all.
>
> Setting INET_FRAG_COMPLETE does two things:
>
> 1) It makes sure input frag processing skips this entry if such
> code paths happen to see it for some reason.
>
> 2) INET_FRAG_COMPLETE must be set when inet_frag_destroy() gets
> called by inet_frag_put() when it drops the refcount to zero.
> There is an assertion on INET_FRAG_COMPLETE in inet_frag_destroy.
>
> Hope that clears things up.
Yes thanks David, this is clear now.
^ permalink raw reply
* Re: [PATCH] vmxnet3: remove duplicate #include
From: David Miller @ 2009-10-29 5:29 UTC (permalink / raw)
To: sbhatewara; +Cc: netdev, weiyi.huang, pv-drivers
In-Reply-To: <alpine.LRH.2.00.0910280910250.24555@sbhatewara-dev1.eng.vmware.com>
From: Shreyas Bhatewara <sbhatewara@vmware.com>
Date: Wed, 28 Oct 2009 09:30:40 -0700 (PDT)
>
> Remove duplicate headerfile includes from vmxnet3_int.h
>
> Signed-off-by: Shreyas Bhatewara <sbhatewara@vmware.com>
> Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
> Signed-off-by: Bhavesh Davda <davda@vmware.com>
Applied.
^ permalink raw reply
* Re: [PATCH] bonding: fix a race condition in calls to slave MII ioctls
From: David Miller @ 2009-10-29 5:25 UTC (permalink / raw)
To: jbohac; +Cc: fubar, netdev
In-Reply-To: <20091021130301.GA4762@midget.suse.cz>
From: Jiri Bohac <jbohac@suse.cz>
Date: Wed, 21 Oct 2009 15:03:01 +0200
> Hi,
>
> In mii monitor mode, bond_check_dev_link() calls the ioctl
> handler of slave devices. It stores the ndo_do_ioctl function
> pointer to a static (!) ioctl variable and later uses it to call the
> handler with the IOCTL macro.
>
> If another thread executes bond_check_dev_link() at the same time
> (even with a different bond, which none of the locks prevent), a
> race condition occurs. If the two racing slaves have different
> drivers, this may result in one driver's ioctl handler being
> called with a pointer to a net_device controlled with a different
> driver, resulting in unpredictable breakage.
>
> Unless I am overlooking something, the "static" must be a
> copy'n'paste error (?).
>
>
> Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Cut and paste... from where? If you look at the 2.6.14-->2.6.14.1
commit in the history-2.6 tree (5db5272c), this static was there from
the moment the link status checking got added to the bonding driver
in Linus's tree.
Nevertheless, it is indeed an awful bug. Patch applied, thanks!
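
The race described in the quoted report can be modelled without threads by replaying the interleaving by hand. The following is a minimal sketch under stated assumptions: ioctl_fn, the two fake driver handlers, struct link_check and the buggy_*/fixed_* helpers are all hypothetical names, and a file-scope static stands in for the function-local static of the report, since both are storage shared by every caller.

```c
/* Hypothetical handler type: returns a value that identifies the driver. */
typedef int (*ioctl_fn)(int dev_id);

static int drv_a_ioctl(int dev_id) { return 100 + dev_id; }
static int drv_b_ioctl(int dev_id) { return 200 + dev_id; }

/* Buggy shape: the handler lives in storage shared by all callers. */
static ioctl_fn shared_ioctl;

static void buggy_prepare(ioctl_fn fn)
{
	shared_ioctl = fn;
}

static int buggy_call(int dev_id)
{
	return shared_ioctl(dev_id);
}

/* Fixed shape: each caller keeps its own handler next to its device. */
struct link_check {
	ioctl_fn ioctl;
	int dev_id;
};

static int fixed_call(const struct link_check *lc)
{
	return lc->ioctl(lc->dev_id);
}

/* Caller 1 (driver A, device 1) is interrupted by caller 2 (driver B). */
static int interleave_buggy(void)
{
	buggy_prepare(drv_a_ioctl);	/* caller 1 stores its handler */
	buggy_prepare(drv_b_ioctl);	/* caller 2 overwrites it */
	return buggy_call(1);		/* caller 1 now runs driver B's handler */
}

static int interleave_fixed(void)
{
	struct link_check c1 = { drv_a_ioctl, 1 };
	struct link_check c2 = { drv_b_ioctl, 2 };

	(void)fixed_call(&c2);		/* caller 2 runs in between */
	return fixed_call(&c1);		/* caller 1 still uses driver A */
}
```

With the shared variable, device 1 ends up handled by driver B's function; with per-caller storage each device keeps its own handler.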
^ permalink raw reply
* Re: [net-next-2.6 PATCH 4/4] vlan: Add support to netdev_ops.ndo_fcoe_get_wwn for VLAN device
From: Joe Eykholt @ 2009-10-29 5:00 UTC (permalink / raw)
To: Jeff Kirsher; +Cc: davem, netdev, gospo, linux-scsi, Yi Zou
In-Reply-To: <20091029042515.15957.86107.stgit@localhost.localdomain>
Jeff Kirsher wrote:
> From: Yi Zou <yi.zou@intel.com>
>
> Implements the netdev_ops.ndo_fcoe_get_wwn for VLAN device.
How would this arrange for different VLANs to get different WWPNs?
Or does it allow FCoE only on one VLAN per port?
I guess that would be fair because some switches support only one FCoE VLAN.
Regards,
Joe
>
> Signed-off-by: Yi Zou <yi.zou@intel.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> ---
>
> net/8021q/vlan_dev.c | 13 +++++++++++++
> 1 files changed, 13 insertions(+), 0 deletions(-)
>
> diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
> index e370197..790fd55 100644
> --- a/net/8021q/vlan_dev.c
> +++ b/net/8021q/vlan_dev.c
> @@ -626,6 +626,17 @@ static int vlan_dev_fcoe_disable(struct net_device *dev)
> rc = ops->ndo_fcoe_disable(real_dev);
> return rc;
> }
> +
> +static int vlan_dev_fcoe_get_wwn(struct net_device *dev, u64 *wwn, int type)
> +{
> + struct net_device *real_dev = vlan_dev_info(dev)->real_dev;
> + const struct net_device_ops *ops = real_dev->netdev_ops;
> + int rc = -EINVAL;
> +
> + if (ops->ndo_fcoe_get_wwn)
> + rc = ops->ndo_fcoe_get_wwn(real_dev, wwn, type);
> + return rc;
> +}
> #endif
>
> static void vlan_dev_change_rx_flags(struct net_device *dev, int change)
> @@ -791,6 +802,7 @@ static const struct net_device_ops vlan_netdev_ops = {
> .ndo_fcoe_ddp_done = vlan_dev_fcoe_ddp_done,
> .ndo_fcoe_enable = vlan_dev_fcoe_enable,
> .ndo_fcoe_disable = vlan_dev_fcoe_disable,
> + .ndo_fcoe_get_wwn = vlan_dev_fcoe_get_wwn,
> #endif
> };
>
> @@ -813,6 +825,7 @@ static const struct net_device_ops vlan_netdev_accel_ops = {
> .ndo_fcoe_ddp_done = vlan_dev_fcoe_ddp_done,
> .ndo_fcoe_enable = vlan_dev_fcoe_enable,
> .ndo_fcoe_disable = vlan_dev_fcoe_disable,
> + .ndo_fcoe_get_wwn = vlan_dev_fcoe_get_wwn,
> #endif
> };
>
>
^ permalink raw reply
* Re: [PATCH] Multicast packet reassembly can fail
From: David Miller @ 2009-10-29 4:57 UTC (permalink / raw)
To: eric.dumazet; +Cc: schen, netdev
In-Reply-To: <4AE81A70.5060307@gmail.com>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 28 Oct 2009 11:18:24 +0100
> Check line 219 of net/ipv4/inet_fragment.c
>
> #ifdef CONFIG_SMP
> /* With SMP race we have to recheck hash table, because
> * such entry could be created on other cpu, while we
> * promoted read lock to write lock.
> */
> hlist_for_each_entry(qp, n, &f->hash[hash], list) {
> if (qp->net == nf && f->match(qp, arg)) {
> atomic_inc(&qp->refcnt);
> write_unlock(&f->lock);
> qp_in->last_in |= INET_FRAG_COMPLETE; <<< HERE >>>
> inet_frag_put(qp_in, f);
> return qp;
> }
> }
> #endif
>
> I really wonder why we set INET_FRAG_COMPLETE here
What has happened here is that another cpu created an identical
frag entry before we took the write lock.
So we're letting that other cpu's entry stand, and will release
our local one and not use it at all.
Setting INET_FRAG_COMPLETE does two things:
1) It makes sure input frag processing skips this entry if such
code paths happen to see it for some reason.
2) INET_FRAG_COMPLETE must be set when inet_frag_destroy() gets
called by inet_frag_put() when it drops the refcount to zero.
There is an assertion on INET_FRAG_COMPLETE in inet_frag_destroy.
Hope that clears things up.
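
The two points above can be modelled in a few lines of single-threaded C. This is only a sketch of the pattern, not the inet_fragment.c code: struct frag_q, FRAG_COMPLETE (with an arbitrary flag value) and the frag_* helpers are hypothetical, locking is left out, and the entry found in the hash is passed in directly.

```c
#include <assert.h>
#include <stddef.h>

#define FRAG_COMPLETE 0x4	/* hypothetical flag value for this model */

struct frag_q {
	int refcnt;
	unsigned int flags;
	int destroyed;		/* set by frag_destroy(), for observation only */
};

/* Destruction insists that the entry was marked complete beforehand. */
static void frag_destroy(struct frag_q *q)
{
	assert(q->flags & FRAG_COMPLETE);
	q->destroyed = 1;
}

/* Drop one reference; the last put destroys the entry. */
static void frag_put(struct frag_q *q)
{
	if (--q->refcnt == 0)
		frag_destroy(q);
}

/*
 * Insert 'qp_in' unless an identical entry 'existing' was created first.
 * In that case take a reference on the existing entry, mark ours complete
 * so that the final put may destroy it, and return the existing one.
 */
static struct frag_q *frag_intern(struct frag_q *existing, struct frag_q *qp_in)
{
	if (existing != NULL) {
		existing->refcnt++;
		qp_in->flags |= FRAG_COMPLETE;
		frag_put(qp_in);
		return existing;
	}
	return qp_in;
}

/* Returns 1 if the losing entry is released and the winner is reused. */
static int demo_lost_race(void)
{
	struct frag_q winner = { 1, 0, 0 };
	struct frag_q mine = { 1, 0, 0 };
	struct frag_q *got = frag_intern(&winner, &mine);

	return got == &winner && winner.refcnt == 2 &&
	       mine.destroyed == 1 && (mine.flags & FRAG_COMPLETE) != 0;
}
```

Without the flag being set before the put, the assertion in the destroy path of this model would fire, which mirrors point 2.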
^ permalink raw reply
* [net-next-2.6 PATCH] e1000e: flow control doesn't re-enable
From: Jeff Kirsher @ 2009-10-29 4:28 UTC (permalink / raw)
To: davem; +Cc: netdev, gospo, Bruce Allan, Jeff Kirsher
From: Bruce Allan <bruce.w.allan@intel.com>
When changing flow control (pause) parameters, the flow control thresholds
(i.e. when to send XON/XOFF frames) may not be set up correctly on parts
with copper media. Call the existing e1000e_set_fc_watermarks()
function to set these thresholds.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/e1000e/ethtool.c | 12 ++++++++++--
1 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/drivers/net/e1000e/ethtool.c b/drivers/net/e1000e/ethtool.c
index a70999b..0364b91 100644
--- a/drivers/net/e1000e/ethtool.c
+++ b/drivers/net/e1000e/ethtool.c
@@ -335,10 +335,18 @@ static int e1000_set_pauseparam(struct net_device *netdev,
hw->fc.current_mode = hw->fc.requested_mode;
- retval = ((hw->phy.media_type == e1000_media_type_fiber) ?
- hw->mac.ops.setup_link(hw) : e1000e_force_mac_fc(hw));
+ if (hw->phy.media_type == e1000_media_type_fiber) {
+ retval = hw->mac.ops.setup_link(hw);
+ /* implicit goto out */
+ } else {
+ retval = e1000e_force_mac_fc(hw);
+ if (retval)
+ goto out;
+ e1000e_set_fc_watermarks(hw);
+ }
}
+out:
clear_bit(__E1000_RESETTING, &adapter->state);
return retval;
}
^ permalink raw reply related
* [net-next-2.6 PATCH 4/4] vlan: Add support to netdev_ops.ndo_fcoe_get_wwn for VLAN device
From: Jeff Kirsher @ 2009-10-29 4:25 UTC (permalink / raw)
To: davem; +Cc: netdev, gospo, linux-scsi, Yi Zou, Jeff Kirsher
In-Reply-To: <20091029042339.15957.37676.stgit@localhost.localdomain>
From: Yi Zou <yi.zou@intel.com>
Implements the netdev_ops.ndo_fcoe_get_wwn for VLAN device.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
net/8021q/vlan_dev.c | 13 +++++++++++++
1 files changed, 13 insertions(+), 0 deletions(-)
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index e370197..790fd55 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -626,6 +626,17 @@ static int vlan_dev_fcoe_disable(struct net_device *dev)
rc = ops->ndo_fcoe_disable(real_dev);
return rc;
}
+
+static int vlan_dev_fcoe_get_wwn(struct net_device *dev, u64 *wwn, int type)
+{
+ struct net_device *real_dev = vlan_dev_info(dev)->real_dev;
+ const struct net_device_ops *ops = real_dev->netdev_ops;
+ int rc = -EINVAL;
+
+ if (ops->ndo_fcoe_get_wwn)
+ rc = ops->ndo_fcoe_get_wwn(real_dev, wwn, type);
+ return rc;
+}
#endif
static void vlan_dev_change_rx_flags(struct net_device *dev, int change)
@@ -791,6 +802,7 @@ static const struct net_device_ops vlan_netdev_ops = {
.ndo_fcoe_ddp_done = vlan_dev_fcoe_ddp_done,
.ndo_fcoe_enable = vlan_dev_fcoe_enable,
.ndo_fcoe_disable = vlan_dev_fcoe_disable,
+ .ndo_fcoe_get_wwn = vlan_dev_fcoe_get_wwn,
#endif
};
@@ -813,6 +825,7 @@ static const struct net_device_ops vlan_netdev_accel_ops = {
.ndo_fcoe_ddp_done = vlan_dev_fcoe_ddp_done,
.ndo_fcoe_enable = vlan_dev_fcoe_enable,
.ndo_fcoe_disable = vlan_dev_fcoe_disable,
+ .ndo_fcoe_get_wwn = vlan_dev_fcoe_get_wwn,
#endif
};
^ permalink raw reply related
* [net-next-2.6 PATCH 3/4] ixgbe: Add support for netdev_ops.ndo_fcoe_get_wwn to 82599
From: Jeff Kirsher @ 2009-10-29 4:24 UTC (permalink / raw)
To: davem; +Cc: netdev, gospo, linux-scsi, Yi Zou, Jeff Kirsher
In-Reply-To: <20091029042339.15957.37676.stgit@localhost.localdomain>
From: Yi Zou <yi.zou@intel.com>
Implements the netdev_ops.ndo_fcoe_get_wwn in 82599 if it finds a valid
prefix for the World Wide Node Name (WWNN) or World Wide Port Name (WWPN),
as well as a valid SAN MAC address.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ixgbe/ixgbe.h | 1 +
drivers/net/ixgbe/ixgbe_fcoe.c | 46 ++++++++++++++++++++++++++++++++++++++++
drivers/net/ixgbe/ixgbe_main.c | 1 +
3 files changed, 48 insertions(+), 0 deletions(-)
diff --git a/drivers/net/ixgbe/ixgbe.h b/drivers/net/ixgbe/ixgbe.h
index 2b85416..7eb08a6 100644
--- a/drivers/net/ixgbe/ixgbe.h
+++ b/drivers/net/ixgbe/ixgbe.h
@@ -457,6 +457,7 @@ extern int ixgbe_fcoe_disable(struct net_device *netdev);
extern u8 ixgbe_fcoe_getapp(struct ixgbe_adapter *adapter);
extern u8 ixgbe_fcoe_setapp(struct ixgbe_adapter *adapter, u8 up);
#endif /* CONFIG_IXGBE_DCB */
+extern int ixgbe_fcoe_get_wwn(struct net_device *netdev, u64 *wwn, int type);
#endif /* IXGBE_FCOE */
#endif /* _IXGBE_H_ */
diff --git a/drivers/net/ixgbe/ixgbe_fcoe.c b/drivers/net/ixgbe/ixgbe_fcoe.c
index a3c9f99..edecdc8 100644
--- a/drivers/net/ixgbe/ixgbe_fcoe.c
+++ b/drivers/net/ixgbe/ixgbe_fcoe.c
@@ -718,3 +718,49 @@ u8 ixgbe_fcoe_setapp(struct ixgbe_adapter *adapter, u8 up)
return 1;
}
#endif /* CONFIG_IXGBE_DCB */
+
+/**
+ * ixgbe_fcoe_get_wwn - get world wide name for the node or the port
+ * @netdev : ixgbe adapter
+ * @wwn : the world wide name
+ * @type: the type of world wide name
+ *
+ * Returns the node or port world wide name if both the prefix and the san
+ * mac address are valid, then the wwn is formed based on the NAA-2 for
+ * IEEE Extended name identifier (ref. to T10 FC-LS Spec., Sec. 15.3).
+ *
+ * Returns : 0 on success
+ */
+int ixgbe_fcoe_get_wwn(struct net_device *netdev, u64 *wwn, int type)
+{
+ int rc = -EINVAL;
+ u16 prefix = 0xffff;
+ struct ixgbe_adapter *adapter = netdev_priv(netdev);
+ struct ixgbe_mac_info *mac = &adapter->hw.mac;
+
+ switch (type) {
+ case NETDEV_FCOE_WWNN:
+ prefix = mac->wwnn_prefix;
+ break;
+ case NETDEV_FCOE_WWPN:
+ prefix = mac->wwpn_prefix;
+ break;
+ default:
+ break;
+ }
+
+ if ((prefix != 0xffff) &&
+ is_valid_ether_addr(mac->san_addr)) {
+ *wwn = ((u64) prefix << 48) |
+ ((u64) mac->san_addr[0] << 40) |
+ ((u64) mac->san_addr[1] << 32) |
+ ((u64) mac->san_addr[2] << 24) |
+ ((u64) mac->san_addr[3] << 16) |
+ ((u64) mac->san_addr[4] << 8) |
+ ((u64) mac->san_addr[5]);
+ rc = 0;
+ }
+ return rc;
+}
+
+
diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index 4c8a449..45c5faf 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -5449,6 +5449,7 @@ static const struct net_device_ops ixgbe_netdev_ops = {
.ndo_fcoe_ddp_done = ixgbe_fcoe_ddp_put,
.ndo_fcoe_enable = ixgbe_fcoe_enable,
.ndo_fcoe_disable = ixgbe_fcoe_disable,
+ .ndo_fcoe_get_wwn = ixgbe_fcoe_get_wwn,
#endif /* IXGBE_FCOE */
};
^ permalink raw reply related
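For readers following the series, the packing done by ixgbe_fcoe_get_wwn in the patch above can be sketched as a standalone user-space function. This is only an illustration under stated assumptions: the name wwn_from_prefix_and_mac is hypothetical, the 0xffff check mirrors the patch's "no prefix" sentinel, and the is_valid_ether_addr() test on the SAN MAC is left out.

```c
#include <stdint.h>
#include <stddef.h>

/*
 * Hypothetical user-space helper, not part of the patch: combine a 16-bit
 * WWN prefix with a 6-byte SAN MAC address into a 64-bit world wide name,
 * using the layout from the patch (prefix in bits 63..48, MAC below it).
 * Returns 0 on success, -1 if an argument is NULL or the prefix is the
 * 0xffff "not set" sentinel.
 */
int wwn_from_prefix_and_mac(uint16_t prefix, const uint8_t mac[6],
			    uint64_t *wwn)
{
	uint64_t v;
	size_t i;

	if (!mac || !wwn || prefix == 0xffff)
		return -1;

	v = prefix;
	for (i = 0; i < 6; i++)
		v = (v << 8) | mac[i];	/* append MAC bytes, first byte highest */

	*wwn = v;
	return 0;
}
```

With the made-up example values prefix 0x2000 and SAN MAC 00:1b:21:3c:9d:f8, the result is 0x2000001b213c9df8.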
* [net-next-2.6 PATCH 1/4] ixgbe: Add support for 82599 alternative WWNN/WWPN prefix
From: Jeff Kirsher @ 2009-10-29 4:23 UTC (permalink / raw)
To: davem
Cc: netdev, gospo, linux-scsi, Yi Zou, Peter P Waskiewicz Jr,
Jeff Kirsher
From: Yi Zou <yi.zou@intel.com>
The 82599 EEPROM supports an alternative prefix for the World Wide Node Name
(WWNN) and World Wide Port Name (WWPN). The prefixes can be used together
with the SAN MAC address to form the WWNN and WWPN, which can be used by
upper layer drivers such as Fibre Channel over Ethernet (FCoE).
Signed-off-by: Yi Zou <yi.zou@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ixgbe/ixgbe_82599.c | 50 +++++++++++++++++++++++++++++++++++++++
drivers/net/ixgbe/ixgbe_type.h | 15 ++++++++++++
2 files changed, 65 insertions(+), 0 deletions(-)
diff --git a/drivers/net/ixgbe/ixgbe_82599.c b/drivers/net/ixgbe/ixgbe_82599.c
index ae27c41..7210689 100644
--- a/drivers/net/ixgbe/ixgbe_82599.c
+++ b/drivers/net/ixgbe/ixgbe_82599.c
@@ -1000,6 +1000,10 @@ static s32 ixgbe_reset_hw_82599(struct ixgbe_hw *hw)
hw->mac.num_rar_entries--;
}
+ /* Store the alternative WWNN/WWPN prefix */
+ hw->mac.ops.get_wwn_prefix(hw, &hw->mac.wwnn_prefix,
+ &hw->mac.wwpn_prefix);
+
reset_hw_out:
return status;
}
@@ -2536,6 +2540,51 @@ fw_version_out:
return status;
}
+/**
+ * ixgbe_get_wwn_prefix_82599 - Get alternative WWNN/WWPN prefix from
+ * the EEPROM
+ * @hw: pointer to hardware structure
+ * @wwnn_prefix: the alternative WWNN prefix
+ * @wwpn_prefix: the alternative WWPN prefix
+ *
+ * This function reads the alternative SAN MAC address block in the EEPROM
+ * to check whether the alternative WWNN/WWPN prefix is supported.
+ **/
+static s32 ixgbe_get_wwn_prefix_82599(struct ixgbe_hw *hw, u16 *wwnn_prefix,
+ u16 *wwpn_prefix)
+{
+ u16 offset, caps;
+ u16 alt_san_mac_blk_offset;
+
+ /* clear output first */
+ *wwnn_prefix = 0xFFFF;
+ *wwpn_prefix = 0xFFFF;
+
+ /* check if alternative SAN MAC is supported */
+ hw->eeprom.ops.read(hw, IXGBE_ALT_SAN_MAC_ADDR_BLK_PTR,
+ &alt_san_mac_blk_offset);
+
+ if ((alt_san_mac_blk_offset == 0) ||
+ (alt_san_mac_blk_offset == 0xFFFF))
+ goto wwn_prefix_out;
+
+ /* check capability in alternative san mac address block */
+ offset = alt_san_mac_blk_offset + IXGBE_ALT_SAN_MAC_ADDR_CAPS_OFFSET;
+ hw->eeprom.ops.read(hw, offset, &caps);
+ if (!(caps & IXGBE_ALT_SAN_MAC_ADDR_CAPS_ALTWWN))
+ goto wwn_prefix_out;
+
+ /* get the corresponding prefix for WWNN/WWPN */
+ offset = alt_san_mac_blk_offset + IXGBE_ALT_SAN_MAC_ADDR_WWNN_OFFSET;
+ hw->eeprom.ops.read(hw, offset, wwnn_prefix);
+
+ offset = alt_san_mac_blk_offset + IXGBE_ALT_SAN_MAC_ADDR_WWPN_OFFSET;
+ hw->eeprom.ops.read(hw, offset, wwpn_prefix);
+
+wwn_prefix_out:
+ return 0;
+}
+
static struct ixgbe_mac_operations mac_ops_82599 = {
.init_hw = &ixgbe_init_hw_generic,
.reset_hw = &ixgbe_reset_hw_82599,
@@ -2547,6 +2596,7 @@ static struct ixgbe_mac_operations mac_ops_82599 = {
.get_mac_addr = &ixgbe_get_mac_addr_generic,
.get_san_mac_addr = &ixgbe_get_san_mac_addr_82599,
.get_device_caps = &ixgbe_get_device_caps_82599,
+ .get_wwn_prefix = &ixgbe_get_wwn_prefix_82599,
.stop_adapter = &ixgbe_stop_adapter_generic,
.get_bus_info = &ixgbe_get_bus_info_generic,
.set_lan_id = &ixgbe_set_lan_id_multi_port_pcie,
diff --git a/drivers/net/ixgbe/ixgbe_type.h b/drivers/net/ixgbe/ixgbe_type.h
index 1cab53e..21b6633 100644
--- a/drivers/net/ixgbe/ixgbe_type.h
+++ b/drivers/net/ixgbe/ixgbe_type.h
@@ -1539,6 +1539,16 @@
#define IXGBE_FW_PASSTHROUGH_PATCH_CONFIG_PTR 0x4
#define IXGBE_FW_PATCH_VERSION_4 0x7
+/* Alternative SAN MAC Address Block */
+#define IXGBE_ALT_SAN_MAC_ADDR_BLK_PTR 0x27 /* Alt. SAN MAC block */
+#define IXGBE_ALT_SAN_MAC_ADDR_CAPS_OFFSET 0x0 /* Alt. SAN MAC capability */
+#define IXGBE_ALT_SAN_MAC_ADDR_PORT0_OFFSET 0x1 /* Alt. SAN MAC 0 offset */
+#define IXGBE_ALT_SAN_MAC_ADDR_PORT1_OFFSET 0x4 /* Alt. SAN MAC 1 offset */
+#define IXGBE_ALT_SAN_MAC_ADDR_WWNN_OFFSET 0x7 /* Alt. WWNN prefix offset */
+#define IXGBE_ALT_SAN_MAC_ADDR_WWPN_OFFSET 0x8 /* Alt. WWPN prefix offset */
+#define IXGBE_ALT_SAN_MAC_ADDR_CAPS_SANMAC 0x0 /* Alt. SAN MAC exists */
+#define IXGBE_ALT_SAN_MAC_ADDR_CAPS_ALTWWN 0x1 /* Alt. WWN base exists */
+
/* PCI Bus Info */
#define IXGBE_PCI_LINK_STATUS 0xB2
#define IXGBE_PCI_DEVICE_CONTROL2 0xC8
@@ -2345,6 +2355,7 @@ struct ixgbe_mac_operations {
s32 (*get_mac_addr)(struct ixgbe_hw *, u8 *);
s32 (*get_san_mac_addr)(struct ixgbe_hw *, u8 *);
s32 (*get_device_caps)(struct ixgbe_hw *, u16 *);
+ s32 (*get_wwn_prefix)(struct ixgbe_hw *, u16 *, u16 *);
s32 (*stop_adapter)(struct ixgbe_hw *);
s32 (*get_bus_info)(struct ixgbe_hw *);
void (*set_lan_id)(struct ixgbe_hw *);
@@ -2416,6 +2427,10 @@ struct ixgbe_mac_info {
u8 addr[IXGBE_ETH_LENGTH_OF_ADDRESS];
u8 perm_addr[IXGBE_ETH_LENGTH_OF_ADDRESS];
u8 san_addr[IXGBE_ETH_LENGTH_OF_ADDRESS];
+ /* prefix for World Wide Node Name (WWNN) */
+ u16 wwnn_prefix;
+ /* prefix for World Wide Port Name (WWPN) */
+ u16 wwpn_prefix;
s32 mc_filter_type;
u32 mcft_size;
u32 vft_size;
^ permalink raw reply related
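The lookup sequence in ixgbe_get_wwn_prefix_82599 above (block pointer, capability bit, then the two prefix words) can be modeled outside the kernel roughly as follows. This is a sketch, not the driver code: the EEPROM is assumed to be a plain array of 16-bit words, the function name get_wwn_prefix_sketch is hypothetical, and the extra bounds checks exist only because of that array model. The offsets are the ones the patch adds to ixgbe_type.h.

```c
#include <stdint.h>
#include <stddef.h>

/* Offsets as added to ixgbe_type.h by the patch (IXGBE_ prefix dropped). */
#define ALT_SAN_MAC_ADDR_BLK_PTR	0x27
#define ALT_SAN_MAC_ADDR_CAPS_OFFSET	0x0
#define ALT_SAN_MAC_ADDR_WWNN_OFFSET	0x7
#define ALT_SAN_MAC_ADDR_WWPN_OFFSET	0x8
#define ALT_SAN_MAC_ADDR_CAPS_ALTWWN	0x1

/*
 * Hypothetical model: `eeprom` is an array of `words` 16-bit words.
 * Both outputs stay at 0xFFFF ("no prefix") unless the block pointer
 * is valid and the ALTWWN capability bit is set.
 */
void get_wwn_prefix_sketch(const uint16_t *eeprom, size_t words,
			   uint16_t *wwnn_prefix, uint16_t *wwpn_prefix)
{
	uint16_t blk;

	/* clear outputs first, as the patch does */
	*wwnn_prefix = 0xFFFF;
	*wwpn_prefix = 0xFFFF;

	if (words <= ALT_SAN_MAC_ADDR_BLK_PTR)
		return;

	/* is an alternative SAN MAC block present at all? */
	blk = eeprom[ALT_SAN_MAC_ADDR_BLK_PTR];
	if (blk == 0 || blk == 0xFFFF)
		return;

	/* bounds check needed only because this model uses a raw array */
	if ((size_t)blk + ALT_SAN_MAC_ADDR_WWPN_OFFSET >= words)
		return;

	/* does the block advertise alternative WWN prefixes? */
	if (!(eeprom[blk + ALT_SAN_MAC_ADDR_CAPS_OFFSET] &
	      ALT_SAN_MAC_ADDR_CAPS_ALTWWN))
		return;

	*wwnn_prefix = eeprom[blk + ALT_SAN_MAC_ADDR_WWNN_OFFSET];
	*wwpn_prefix = eeprom[blk + ALT_SAN_MAC_ADDR_WWPN_OFFSET];
}
```

Leaving both prefixes at 0xFFFF on every early exit is what lets a later caller treat 0xFFFF as "no WWN available", which is how patch 3/4 of this series uses the value.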
* [net-next-2.6 PATCH 2/4] net: Add ndo_fcoe_get_wwn to net_device_ops
From: Jeff Kirsher @ 2009-10-29 4:24 UTC (permalink / raw)
To: davem; +Cc: netdev, gospo, linux-scsi, Yi Zou, Jeff Kirsher
In-Reply-To: <20091029042339.15957.37676.stgit@localhost.localdomain>
From: Yi Zou <yi.zou@intel.com>
Add ndo_fcoe_get_wwn so that Fibre Channel over Ethernet (FCoE) can make use
of the World Wide Port Name (WWPN) and World Wide Node Name (WWNN) provided
by the underlying network interface driver.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
include/linux/netdevice.h | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index e7c227d..656110a 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -635,6 +635,10 @@ struct net_device_ops {
unsigned int sgc);
int (*ndo_fcoe_ddp_done)(struct net_device *dev,
u16 xid);
+#define NETDEV_FCOE_WWNN 0
+#define NETDEV_FCOE_WWPN 1
+ int (*ndo_fcoe_get_wwn)(struct net_device *dev,
+ u64 *wwn, int type);
#endif
};
^ permalink raw reply related
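How an upper layer might consume the new op is not shown in this series; the following is a minimal sketch of one plausible calling pattern, assuming the caller keeps its own fallback WWN. The struct definitions are reduced stand-ins for the kernel types, and fcoe_pick_wwn and demo_get_wwn are hypothetical names used only for illustration.

```c
#include <stdint.h>
#include <stddef.h>

#define NETDEV_FCOE_WWNN 0
#define NETDEV_FCOE_WWPN 1

/* Stand-ins for the kernel types, reduced to what the sketch needs. */
struct net_device;
struct net_device_ops {
	int (*ndo_fcoe_get_wwn)(struct net_device *dev, uint64_t *wwn,
				int type);
};
struct net_device {
	const struct net_device_ops *netdev_ops;
};

/* Hypothetical driver op: returns fixed example names, error otherwise. */
int demo_get_wwn(struct net_device *dev, uint64_t *wwn, int type)
{
	(void)dev;
	switch (type) {
	case NETDEV_FCOE_WWNN:
		*wwn = 0x2000000000000001ULL;
		return 0;
	case NETDEV_FCOE_WWPN:
		*wwn = 0x2001000000000001ULL;
		return 0;
	default:
		return -1;
	}
}

/*
 * Hypothetical caller: ask the driver for a WWN and use `fallback`
 * when the op is missing or reports an error.
 */
uint64_t fcoe_pick_wwn(struct net_device *dev, int type, uint64_t fallback)
{
	uint64_t wwn;

	if (dev->netdev_ops && dev->netdev_ops->ndo_fcoe_get_wwn &&
	    dev->netdev_ops->ndo_fcoe_get_wwn(dev, &wwn, type) == 0)
		return wwn;
	return fallback;
}
```

Checking the function pointer before calling it matters because the op is optional: drivers that do not set ndo_fcoe_get_wwn leave it NULL.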
* Re: [PATCH 2/2] tc35815: Enable NAPI
From: Atsushi Nemoto @ 2009-10-29 4:14 UTC (permalink / raw)
To: davem; +Cc: netdev, ralf.roesch
In-Reply-To: <20091028.035743.76929614.davem@davemloft.net>
On Wed, 28 Oct 2009 03:57:43 -0700 (PDT), David Miller <davem@davemloft.net> wrote:
> Please remove the NAPI enabling macro and the tests for it.
> NAPI support should be unconditional.
>
> If people want to test the pre-NAPI behavior, they can check
> out an older copy of the driver quite easily.
OK, I will do it. Thank you.
---
Atsushi Nemoto
^ permalink raw reply
* Re: wanPMC-CxT1E1
From: Greg KH @ 2009-10-29 2:54 UTC (permalink / raw)
To: Bob Beers; +Cc: netdev
In-Reply-To: <4f6ba3b0910281843p73ef31bdm54b4640c21dbffbc@mail.gmail.com>
On Wed, Oct 28, 2009 at 09:43:44PM -0400, Bob Beers wrote:
> On Wed, Oct 28, 2009 at 9:05 PM, Greg KH <greg@kroah.com> wrote:
> > On Tue, Oct 27, 2009 at 01:48:53PM -0400, Bob Beers wrote:
> >> On Mon, Oct 26, 2009 at 4:41 PM, Greg KH <greg@kroah.com> wrote:
> >> > Getting it to build on 2.6.31 is more important than RHEL5, we can't do
> >> > anything with an old kernel like that.
> >>
> >> ok, so where do I start, I have a system ready to start
> >> git cloning, and creating patches. I googled for a while
> >> but didn't find a nice recipe for participating in the -staging
> >> process.
> >
> > Ick, this isn't going to be easy, a lot of work needs to be done on the
> > driver to get it just to build on the latest kernel tree. I personally
> > don't have the time to do it right now, but will gladly accept patches
> > that add it to the staging tree if someone else wants to do it.
> >
> > sorry,
>
> I've started down the path. Is 'successfully compiles' the only requirement
> for the first patch?
Yup, as long as it builds I'm happy :)
thanks,
greg k-h
^ permalink raw reply
* Re: wanPMC-CxT1E1
From: Bob Beers @ 2009-10-29 1:43 UTC (permalink / raw)
To: Greg KH; +Cc: netdev
In-Reply-To: <20091029010535.GA18723@kroah.com>
On Wed, Oct 28, 2009 at 9:05 PM, Greg KH <greg@kroah.com> wrote:
> On Tue, Oct 27, 2009 at 01:48:53PM -0400, Bob Beers wrote:
>> On Mon, Oct 26, 2009 at 4:41 PM, Greg KH <greg@kroah.com> wrote:
>> > Getting it to build on 2.6.31 is more important than RHEL5, we can't do
>> > anything with an old kernel like that.
>>
>> ok, so where do I start, I have a system ready to start
>> git cloning, and creating patches. I googled for a while
>> but didn't find a nice recipe for participating in the -staging
>> process.
>
> Ick, this isn't going to be easy, a lot of work needs to be done on the
> driver to get it just to build on the latest kernel tree. I personally
> don't have the time to do it right now, but will gladly accept patches
> that add it to the staging tree if someone else wants to do it.
>
> sorry,
I've started down the path. Is 'successfully compiles' the only requirement
for the first patch?
thanks,
--
-Bob Beers
^ permalink raw reply
* Re: wanPMC-CxT1E1
From: Greg KH @ 2009-10-29 1:05 UTC (permalink / raw)
To: Bob Beers; +Cc: netdev
In-Reply-To: <4f6ba3b0910271048n10ff37fek9af191b133892e1e@mail.gmail.com>
On Tue, Oct 27, 2009 at 01:48:53PM -0400, Bob Beers wrote:
> On Mon, Oct 26, 2009 at 4:41 PM, Greg KH <greg@kroah.com> wrote:
> > Getting it to build on 2.6.31 is more important than RHEL5, we can't do
> > anything with an old kernel like that.
>
> ok, so where do I start, I have a system ready to start
> git cloning, and creating patches. I googled for a while
> but didn't find a nice recipe for participating in the -staging
> process.
Ick, this isn't going to be easy, a lot of work needs to be done on the
driver to get it just to build on the latest kernel tree. I personally
don't have the time to do it right now, but will gladly accept patches
that add it to the staging tree if someone else wants to do it.
sorry,
greg k-h
^ permalink raw reply
* Re: [PATCH] hso: fix debug routines
From: Andrew Morton @ 2009-10-28 23:16 UTC (permalink / raw)
To: Antti Kaijanmäki; +Cc: Greg KH, linux-kernel, netdev, Jan Dumon
In-Reply-To: <1256653615.3591.113.camel@nomovok.homedomain>
On Tue, 27 Oct 2009 16:26:55 +0200
Antti Kaijanmäki <antti.kaijanmaki@nomovok.com> wrote:
> On Mon, 2009-10-26 at 12:40 -0700, Greg KH wrote:
> > Yes, that should be a new patch, especially as it would not be needed
> > to fix older kernels for the original bug.
> >
> > So, care to send 2 patches? The debug one isn't needed to be sent to
> > the stable@kernel.org address.
>
>
> Signed-off-by: Antti Kaijanmäki <antti.kaijanmaki@nomovok.com>
> ---
> drivers/net/usb/hso.c | 4 ++--
> 1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/usb/hso.c b/drivers/net/usb/hso.c
> index fa4e581..746839b 100644
> --- a/drivers/net/usb/hso.c
> +++ b/drivers/net/usb/hso.c
> @@ -378,7 +378,7 @@ static void dbg_dump(int line_count, const char *func_name, unsigned char *buf,
> }
>
> #define DUMP(buf_, len_) \
> - dbg_dump(__LINE__, __func__, buf_, len_)
> + dbg_dump(__LINE__, __func__, (unsigned char *)buf_, len_)
>
> #define DUMP1(buf_, len_) \
> do { \
> @@ -1527,7 +1527,7 @@ static void tiocmget_intr_callback(struct urb *urb)
> dev_warn(&usb->dev,
> "hso received invalid serial state notification\n");
> DUMP(serial_state_notification,
> - sizeof(hso_serial_state_notifation))
> + sizeof(struct hso_serial_state_notification));
> } else {
>
> UART_state_bitmap = le16_to_cpu(serial_state_notification->
This patch has no changelog, and I'm not seeing any description of what
it fixes and how it fixes it up-thread.
^ permalink raw reply
* Re: Fw: [Bug 14470] New: freez in TCP stack
From: Denys Fedoryschenko @ 2009-10-28 22:27 UTC (permalink / raw)
To: netdev; +Cc: Stephen Hemminger
In-Reply-To: <20091028151313.ba4a4d23.akpm@linux-foundation.org>
>
> Twice a day on six separate machines. That ain't no hardware glitch.
>
> Vaclav, are you able to say whether this is a regression? Did those
> machines run 2.6.30 (for example)?
>
> Thanks.
I had issues on Dell also. One was fixed by a BIOS update, the other only after
tuning some voodoo settings in sysctl (I was in a hurry, with no redundancy for
this server, and it was rebooting 1-3 times each day). It happens on both 32-bit
and 64-bit kernels (32-bit userspace), also under "heavy" TCP workload; both
machines act as proxies.
But my issue is probably different: on both Dell servers I had bnx2 with IPMI.
It was very weird: nmi_watchdog, panic on reboot / on oops, detect
softlockups, detect deadlocks, detect hung tasks, hangcheck timer - none of them
helped; only a hardware watchdog (IPMI or iTCO) was able to catch the hang and
reboot the server. Because I didn't have anything useful to report (remote server,
and netconsole didn't give anything), I didn't file a bugzilla report.
I'm not sure my post is useful in this case, but I'm sharing the experience anyway.
>
> > We had to put a serial console on these servers to catch the oops. Is there
> > anything else we can do to debug this?
> > The RIP is always the same:
> >
> > RIP: 0010:[<ffffffff814203cc>] [<ffffffff814203cc>]
> > tcp_xmit_retransmit_queue+0x8c/0x290
> >
> > The rest of the oops always differs a little ... here is an example:
> >
> > RIP: 0010:[<ffffffff814203cc>] [<ffffffff814203cc>]
> > tcp_xmit_retransmit_queue+0x8c/0x290
> > RSP: 0018:ffffc90000003a40 EFLAGS: 00010246
> > RAX: ffff8807e7420678 RBX: ffff8807e74205c0 RCX: 0000000000000000
> > RDX: 000000004598a105 RSI: 0000000000000000 RDI: ffff8807e74205c0
> > RBP: ffffc90000003a80 R08: 0000000000000003 R09: 0000000000000000
> > R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> > R13: ffff8807e74205c0 R14: ffff8807e7420678 R15: 0000000000000000
> > FS: 0000000000000000(0000) GS:ffffc90000000000(0000)
> > knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > CR2: 0000000000000000 CR3: 0000000001001000 CR4: 00000000000006f0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > Process swapper (pid: 0, threadinfo ffffffff81608000, task
> > ffffffff81631440) Stack:
> > ffffc90000003a60 0000000000000000 4598a105e74205c0 000000004598a101
> > <0> 000000000000050e ffff8807e74205c0 0000000000000003 0000000000000000
> > <0> ffffc90000003b40 ffffffff8141ae4a ffff8807e7420678 0000000000000000
> > Call Trace:
> > <IRQ>
> > [<ffffffff8141ae4a>] tcp_ack+0x170a/0x1dd0
> > [<ffffffff8141c362>] tcp_rcv_state_process+0x122/0xab0
> > [<ffffffff81422c6c>] tcp_v4_do_rcv+0xac/0x220
> > [<ffffffff813fd02f>] ? nf_iterate+0x5f/0x90
> > [<ffffffff81424b26>] tcp_v4_rcv+0x586/0x6b0
> > [<ffffffff813fd0c5>] ? nf_hook_slow+0x65/0xf0
> > [<ffffffff81406b70>] ? ip_local_deliver_finish+0x0/0x120
> > [<ffffffff81406bcf>] ip_local_deliver_finish+0x5f/0x120
> > [<ffffffff8140715b>] ip_local_deliver+0x3b/0x90
> > [<ffffffff81406971>] ip_rcv_finish+0x141/0x340
> > [<ffffffff8140701f>] ip_rcv+0x24f/0x350
> > [<ffffffff813e7ced>] netif_receive_skb+0x20d/0x2f0
> > [<ffffffff813e7e90>] napi_skb_finish+0x40/0x50
> > [<ffffffff813e82f4>] napi_gro_receive+0x34/0x40
> > [<ffffffff8133e0c8>] e1000_receive_skb+0x48/0x60
> > [<ffffffff81342342>] e1000_clean_rx_irq+0xf2/0x330
> > [<ffffffff813410a1>] e1000_clean+0x81/0x2a0
> > [<ffffffff81054ce1>] ? ktime_get+0x11/0x50
> > [<ffffffff813eaf1c>] net_rx_action+0x9c/0x130
> > [<ffffffff81046940>] ? get_next_timer_interrupt+0x1d0/0x210
> > [<ffffffff81041bd7>] __do_softirq+0xb7/0x160
> > [<ffffffff8100c27c>] call_softirq+0x1c/0x30
> > [<ffffffff8100e04d>] do_softirq+0x3d/0x80
> > [<ffffffff81041b0b>] irq_exit+0x7b/0x90
> > [<ffffffff8100d613>] do_IRQ+0x73/0xe0
> > [<ffffffff8100bb13>] ret_from_intr+0x0/0xa
> > <EOI>
> > [<ffffffff81296e6c>] ? acpi_idle_enter_bm+0x245/0x271
> > [<ffffffff81296e62>] ? acpi_idle_enter_bm+0x23b/0x271
> > [<ffffffff813c7a08>] ? cpuidle_idle_call+0x98/0xf0
> > [<ffffffff8100a104>] ? cpu_idle+0x94/0xd0
> > [<ffffffff81468db6>] ? rest_init+0x66/0x70
> > [<ffffffff816a082f>] ? start_kernel+0x2ef/0x340
> > [<ffffffff8169fd54>] ? x86_64_start_reservations+0x84/0x90
> > [<ffffffff8169fe32>] ? x86_64_start_kernel+0xd2/0x100
> > Code: 00 eb 28 8b 83 d0 03 00 00 41 39 44 24 40 0f 89 00 01 00 00 41 0f
> > b6 cd 41 bd 2f 00 00 00 83 e1 03 0f 84 fc 00 00 00 4d 8b 24 24 <49> 8b 04
> > 24 4d 39 f4 0f 18 08 0f 84 d9 00 00 00 4c 3b a3 b8 01
> > RIP [<ffffffff814203cc>] tcp_xmit_retransmit_queue+0x8c/0x290
> > RSP <ffffc90000003a40>
> > CR2: 0000000000000000
> > ---[ end trace d97d99c9ae1d52cc ]---
> > Kernel panic - not syncing: Fatal exception in interrupt
> > Pid: 0, comm: swapper Tainted: G D 2.6.31 #2
> > Call Trace:
> > <IRQ> [<ffffffff8103cab0>] panic+0xa0/0x170
> > [<ffffffff8100bb13>] ? ret_from_intr+0x0/0xa
> > [<ffffffff8103c74e>] ? print_oops_end_marker+0x1e/0x20
> > [<ffffffff8100f38e>] oops_end+0x9e/0xb0
> > [<ffffffff81025b9a>] no_context+0x15a/0x250
> > [<ffffffff81025e2b>] __bad_area_nosemaphore+0xdb/0x1c0
> > [<ffffffff813e89e9>] ? dev_hard_start_xmit+0x269/0x2f0
> > [<ffffffff81025fae>] bad_area_nosemaphore+0xe/0x10
> > [<ffffffff8102639f>] do_page_fault+0x17f/0x260
> > [<ffffffff8147eadf>] page_fault+0x1f/0x30
> > [<ffffffff814203cc>] ? tcp_xmit_retransmit_queue+0x8c/0x290
> > [<ffffffff8141ae4a>] tcp_ack+0x170a/0x1dd0
> > [<ffffffff8141c362>] tcp_rcv_state_process+0x122/0xab0
> > [<ffffffff81422c6c>] tcp_v4_do_rcv+0xac/0x220
> > [<ffffffff813fd02f>] ? nf_iterate+0x5f/0x90
> > [<ffffffff81424b26>] tcp_v4_rcv+0x586/0x6b0
> > [<ffffffff813fd0c5>] ? nf_hook_slow+0x65/0xf0
> > [<ffffffff81406b70>] ? ip_local_deliver_finish+0x0/0x120
> > [<ffffffff81406bcf>] ip_local_deliver_finish+0x5f/0x120
> > [<ffffffff8140715b>] ip_local_deliver+0x3b/0x90
> > [<ffffffff81406971>] ip_rcv_finish+0x141/0x340
> > [<ffffffff8140701f>] ip_rcv+0x24f/0x350
> > [<ffffffff813e7ced>] netif_receive_skb+0x20d/0x2f0
> > [<ffffffff813e7e90>] napi_skb_finish+0x40/0x50
> > [<ffffffff813e82f4>] napi_gro_receive+0x34/0x40
> > [<ffffffff8133e0c8>] e1000_receive_skb+0x48/0x60
> > [<ffffffff81342342>] e1000_clean_rx_irq+0xf2/0x330
> > [<ffffffff813410a1>] e1000_clean+0x81/0x2a0
> > [<ffffffff81054ce1>] ? ktime_get+0x11/0x50
> > [<ffffffff813eaf1c>] net_rx_action+0x9c/0x130
> > [<ffffffff81046940>] ? get_next_timer_interrupt+0x1d0/0x210
> > [<ffffffff81041bd7>] __do_softirq+0xb7/0x160
> > [<ffffffff8100c27c>] call_softirq+0x1c/0x30
> > [<ffffffff8100e04d>] do_softirq+0x3d/0x80
> > [<ffffffff81041b0b>] irq_exit+0x7b/0x90
> > [<ffffffff8100d613>] do_IRQ+0x73/0xe0
> > [<ffffffff8100bb13>] ret_from_intr+0x0/0xa
> > <EOI> [<ffffffff81296e6c>] ? acpi_idle_enter_bm+0x245/0x271
> > [<ffffffff81296e62>] ? acpi_idle_enter_bm+0x23b/0x271
> > [<ffffffff813c7a08>] ? cpuidle_idle_call+0x98/0xf0
> > [<ffffffff8100a104>] ? cpu_idle+0x94/0xd0
> > [<ffffffff81468db6>] ? rest_init+0x66/0x70
> > [<ffffffff816a082f>] ? start_kernel+0x2ef/0x340
> > [<ffffffff8169fd54>] ? x86_64_start_reservations+0x84/0x90
> > [<ffffffff8169fe32>] ? x86_64_start_kernel+0xd2/0x100
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: Fw: [Bug 14470] New: freez in TCP stack
From: Andrew Morton @ 2009-10-28 22:13 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev, kolo, bugzilla-daemon
In-Reply-To: <20091026084132.57bc3d07@nehalam>
On Mon, 26 Oct 2009 08:41:32 -0700
Stephen Hemminger <shemminger@linux-foundation.org> wrote:
>
>
> Begin forwarded message:
>
> Date: Mon, 26 Oct 2009 12:47:22 GMT
> From: bugzilla-daemon@bugzilla.kernel.org
> To: shemminger@linux-foundation.org
> Subject: [Bug 14470] New: freez in TCP stack
>
Stephen, please retain the bugzilla and reporter email cc's when
forwarding a report to a mailing list.
> http://bugzilla.kernel.org/show_bug.cgi?id=14470
>
> Summary: freez in TCP stack
> Product: Networking
> Version: 2.5
> Kernel Version: 2.6.31
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: high
> Priority: P1
> Component: IPV4
> AssignedTo: shemminger@linux-foundation.org
> ReportedBy: kolo@albatani.cz
> Regression: No
>
>
> We are hitting kernel panics on Dell R610 servers with e1000e NICs; it appears
> usually under high network traffic (around 100Mbit/s), but that is not a rule:
> it has happened even under low traffic.
>
> Servers are used as reverse http proxy (varnish).
>
> On 6 identical servers this panic happens approx. 2 times a day depending on
> network load. The machine completely freezes until the management watchdog reboots it.
>
Twice a day on six separate machines. That ain't no hardware glitch.
Vaclav, are you able to say whether this is a regression? Did those
machines run 2.6.30 (for example)?
Thanks.
> We had to put a serial console on these servers to catch the oops. Is there
> anything else we can do to debug this?
> The RIP is always the same:
>
> RIP: 0010:[<ffffffff814203cc>] [<ffffffff814203cc>]
> tcp_xmit_retransmit_queue+0x8c/0x290
>
> The rest of the oops always differs a little ... here is an example:
>
> RIP: 0010:[<ffffffff814203cc>] [<ffffffff814203cc>]
> tcp_xmit_retransmit_queue+0x8c/0x290
> RSP: 0018:ffffc90000003a40 EFLAGS: 00010246
> RAX: ffff8807e7420678 RBX: ffff8807e74205c0 RCX: 0000000000000000
> RDX: 000000004598a105 RSI: 0000000000000000 RDI: ffff8807e74205c0
> RBP: ffffc90000003a80 R08: 0000000000000003 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> R13: ffff8807e74205c0 R14: ffff8807e7420678 R15: 0000000000000000
> FS: 0000000000000000(0000) GS:ffffc90000000000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000000000 CR3: 0000000001001000 CR4: 00000000000006f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process swapper (pid: 0, threadinfo ffffffff81608000, task ffffffff81631440)
> Stack:
> ffffc90000003a60 0000000000000000 4598a105e74205c0 000000004598a101
> <0> 000000000000050e ffff8807e74205c0 0000000000000003 0000000000000000
> <0> ffffc90000003b40 ffffffff8141ae4a ffff8807e7420678 0000000000000000
> Call Trace:
> <IRQ>
> [<ffffffff8141ae4a>] tcp_ack+0x170a/0x1dd0
> [<ffffffff8141c362>] tcp_rcv_state_process+0x122/0xab0
> [<ffffffff81422c6c>] tcp_v4_do_rcv+0xac/0x220
> [<ffffffff813fd02f>] ? nf_iterate+0x5f/0x90
> [<ffffffff81424b26>] tcp_v4_rcv+0x586/0x6b0
> [<ffffffff813fd0c5>] ? nf_hook_slow+0x65/0xf0
> [<ffffffff81406b70>] ? ip_local_deliver_finish+0x0/0x120
> [<ffffffff81406bcf>] ip_local_deliver_finish+0x5f/0x120
> [<ffffffff8140715b>] ip_local_deliver+0x3b/0x90
> [<ffffffff81406971>] ip_rcv_finish+0x141/0x340
> [<ffffffff8140701f>] ip_rcv+0x24f/0x350
> [<ffffffff813e7ced>] netif_receive_skb+0x20d/0x2f0
> [<ffffffff813e7e90>] napi_skb_finish+0x40/0x50
> [<ffffffff813e82f4>] napi_gro_receive+0x34/0x40
> [<ffffffff8133e0c8>] e1000_receive_skb+0x48/0x60
> [<ffffffff81342342>] e1000_clean_rx_irq+0xf2/0x330
> [<ffffffff813410a1>] e1000_clean+0x81/0x2a0
> [<ffffffff81054ce1>] ? ktime_get+0x11/0x50
> [<ffffffff813eaf1c>] net_rx_action+0x9c/0x130
> [<ffffffff81046940>] ? get_next_timer_interrupt+0x1d0/0x210
> [<ffffffff81041bd7>] __do_softirq+0xb7/0x160
> [<ffffffff8100c27c>] call_softirq+0x1c/0x30
> [<ffffffff8100e04d>] do_softirq+0x3d/0x80
> [<ffffffff81041b0b>] irq_exit+0x7b/0x90
> [<ffffffff8100d613>] do_IRQ+0x73/0xe0
> [<ffffffff8100bb13>] ret_from_intr+0x0/0xa
> <EOI>
> [<ffffffff81296e6c>] ? acpi_idle_enter_bm+0x245/0x271
> [<ffffffff81296e62>] ? acpi_idle_enter_bm+0x23b/0x271
> [<ffffffff813c7a08>] ? cpuidle_idle_call+0x98/0xf0
> [<ffffffff8100a104>] ? cpu_idle+0x94/0xd0
> [<ffffffff81468db6>] ? rest_init+0x66/0x70
> [<ffffffff816a082f>] ? start_kernel+0x2ef/0x340
> [<ffffffff8169fd54>] ? x86_64_start_reservations+0x84/0x90
> [<ffffffff8169fe32>] ? x86_64_start_kernel+0xd2/0x100
> Code: 00 eb 28 8b 83 d0 03 00 00 41 39 44 24 40 0f 89 00 01 00 00 41 0f b6 cd
> 41 bd 2f 00 00 00 83 e1 03 0f 84 fc 00 00 00 4d 8b 24 24 <49> 8b 04 24 4d 39 f4
> 0f 18 08 0f 84 d9 00 00 00 4c 3b a3 b8 01
> RIP [<ffffffff814203cc>] tcp_xmit_retransmit_queue+0x8c/0x290
> RSP <ffffc90000003a40>
> CR2: 0000000000000000
> ---[ end trace d97d99c9ae1d52cc ]---
> Kernel panic - not syncing: Fatal exception in interrupt
> Pid: 0, comm: swapper Tainted: G D 2.6.31 #2
> Call Trace:
> <IRQ> [<ffffffff8103cab0>] panic+0xa0/0x170
> [<ffffffff8100bb13>] ? ret_from_intr+0x0/0xa
> [<ffffffff8103c74e>] ? print_oops_end_marker+0x1e/0x20
> [<ffffffff8100f38e>] oops_end+0x9e/0xb0
> [<ffffffff81025b9a>] no_context+0x15a/0x250
> [<ffffffff81025e2b>] __bad_area_nosemaphore+0xdb/0x1c0
> [<ffffffff813e89e9>] ? dev_hard_start_xmit+0x269/0x2f0
> [<ffffffff81025fae>] bad_area_nosemaphore+0xe/0x10
> [<ffffffff8102639f>] do_page_fault+0x17f/0x260
> [<ffffffff8147eadf>] page_fault+0x1f/0x30
> [<ffffffff814203cc>] ? tcp_xmit_retransmit_queue+0x8c/0x290
> [<ffffffff8141ae4a>] tcp_ack+0x170a/0x1dd0
> [<ffffffff8141c362>] tcp_rcv_state_process+0x122/0xab0
> [<ffffffff81422c6c>] tcp_v4_do_rcv+0xac/0x220
> [<ffffffff813fd02f>] ? nf_iterate+0x5f/0x90
> [<ffffffff81424b26>] tcp_v4_rcv+0x586/0x6b0
> [<ffffffff813fd0c5>] ? nf_hook_slow+0x65/0xf0
> [<ffffffff81406b70>] ? ip_local_deliver_finish+0x0/0x120
> [<ffffffff81406bcf>] ip_local_deliver_finish+0x5f/0x120
> [<ffffffff8140715b>] ip_local_deliver+0x3b/0x90
> [<ffffffff81406971>] ip_rcv_finish+0x141/0x340
> [<ffffffff8140701f>] ip_rcv+0x24f/0x350
> [<ffffffff813e7ced>] netif_receive_skb+0x20d/0x2f0
> [<ffffffff813e7e90>] napi_skb_finish+0x40/0x50
> [<ffffffff813e82f4>] napi_gro_receive+0x34/0x40
> [<ffffffff8133e0c8>] e1000_receive_skb+0x48/0x60
> [<ffffffff81342342>] e1000_clean_rx_irq+0xf2/0x330
> [<ffffffff813410a1>] e1000_clean+0x81/0x2a0
> [<ffffffff81054ce1>] ? ktime_get+0x11/0x50
> [<ffffffff813eaf1c>] net_rx_action+0x9c/0x130
> [<ffffffff81046940>] ? get_next_timer_interrupt+0x1d0/0x210
> [<ffffffff81041bd7>] __do_softirq+0xb7/0x160
> [<ffffffff8100c27c>] call_softirq+0x1c/0x30
> [<ffffffff8100e04d>] do_softirq+0x3d/0x80
> [<ffffffff81041b0b>] irq_exit+0x7b/0x90
> [<ffffffff8100d613>] do_IRQ+0x73/0xe0
> [<ffffffff8100bb13>] ret_from_intr+0x0/0xa
> <EOI> [<ffffffff81296e6c>] ? acpi_idle_enter_bm+0x245/0x271
> [<ffffffff81296e62>] ? acpi_idle_enter_bm+0x23b/0x271
> [<ffffffff813c7a08>] ? cpuidle_idle_call+0x98/0xf0
> [<ffffffff8100a104>] ? cpu_idle+0x94/0xd0
> [<ffffffff81468db6>] ? rest_init+0x66/0x70
> [<ffffffff816a082f>] ? start_kernel+0x2ef/0x340
> [<ffffffff8169fd54>] ? x86_64_start_reservations+0x84/0x90
> [<ffffffff8169fe32>] ? x86_64_start_kernel+0xd2/0x100
>
^ permalink raw reply
* Re: pull request: wireless-next-2.6 2009-10-28
From: Bartlomiej Zolnierkiewicz @ 2009-10-28 21:56 UTC (permalink / raw)
To: John W. Linville; +Cc: davem, linux-wireless, netdev, linux-kernel
In-Reply-To: <20091028211031.GF2856@tuxdriver.com>
On Wednesday 28 October 2009 22:10:32 John W. Linville wrote:
> Dave,
>
> I let my patches pile-up! Yikes!!
>
> This request includes the usual ton of stuff for -next -- driver
> updates, fixes for some earlier -next stuff, a few cfg80211 changes to
> accommodate the libertas driver, etc. Of note is the rt2800pci support
> added to the rt2x00 family.
Unfortunately rt2800pci support is non-functioning at the moment... :(
> Please let me know if there are problems!
I find it rather disappointing that all my review comments regarding
rt2800pci support were just completely ignored and then the initial
patch was merged just as it was..
The way rt2800usb and rt2800pci drivers are designed really results
in making the task of adding working support for RT28x0 and RT30x0
chipsets to rt2x00 infrastructure more difficult and time consuming
than it should be... :(
--
Bartlomiej Zolnierkiewicz
^ permalink raw reply
* dev_flags definitions in broadcom.c
From: Matt Carlson @ 2009-10-28 21:25 UTC (permalink / raw)
To: Nate Case; +Cc: Maciej W. Rozycki, Jeff Garzik, netdev@vger.kernel.org
Nate,
On May 17, 2008 you submitted a patch titled
"PHYLIB: Add 1000Base-X support for Broadcom bcm5482". In that patch
you defined several dev_flags definitions for the broadcom module. I
only see the PHY_BCM_FLAGS_MODE_1000BX being used in the code though. I
quickly scanned through the phy_connect calls in drivers/net and didn't
see any caller using this preprocessor definition or an equivalent
hardcoded constant. Perhaps I missed something though. Can you tell me
where these flags are set?
My interest in this is several-fold. First, I'd like to move the
definitions to include/linux/brcmphy.h so that the same preprocessor
definitions can be used at both ends. But more important than that, I'd
like to get a handle on how many of these definitions are actually used
and how many are just placeholders. With only 32 bits to use, the flags
might become a precious resource. I have plans for a few of these bits
myself, and I'd like reviewers to take note of how they are used.
^ permalink raw reply
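To make the "same definitions at both ends" idea concrete, here is a small sketch of what a shared set of dev_flags definitions could look like, with one helper standing in for the MAC driver side and one for the PHY driver side. Everything here is an assumption for illustration: the EX_ names, the bit positions, and both helper functions are hypothetical and are not taken from broadcom.c or brcmphy.h.

```c
#include <stdint.h>

/* Hypothetical flag layout; names and bit positions are illustrative only. */
#define EX_PHY_FLAG_MODE_COPPER	(1u << 0)
#define EX_PHY_FLAG_MODE_1000BX	(1u << 1)
#define EX_PHY_FLAG_VALID	(1u << 31)

/* MAC driver side: build the 32-bit word it would pass as dev_flags. */
uint32_t ex_make_dev_flags(int use_1000bx)
{
	uint32_t flags = EX_PHY_FLAG_VALID;

	flags |= use_1000bx ? EX_PHY_FLAG_MODE_1000BX
			    : EX_PHY_FLAG_MODE_COPPER;
	return flags;
}

/* PHY driver side: decode the same word using the same definitions. */
int ex_wants_1000bx(uint32_t dev_flags)
{
	return (dev_flags & EX_PHY_FLAG_VALID) &&
	       (dev_flags & EX_PHY_FLAG_MODE_1000BX);
}
```

Keeping the definitions in one header means a hardcoded constant on the caller side cannot silently drift from what the PHY driver tests for, and it makes it easy to audit how many of the 32 bits are actually in use.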