From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
NetDev <netdev@vger.kernel.org>
Subject: 2.6.31.6: Intel 82574L devices spontaneously dropping off PCIe?
Date: Sun, 13 Dec 2009 12:37:58 -0800
Message-ID: <4B2550A6.8030705@goop.org>
I have a Supermicro X8SIL-F system, which has a couple of on-board
82574L gigabit interfaces. I'm running the stock F12 kernel on it
(2.6.31.6-166.fc12.x86_64). This is a new machine, so I'm trying to
work out if this is a hardware problem I should RMA the board for, or if
this is some kind of driver bug.
The interfaces come up and apparently work fine - for a while. But
after a bit of load (say, ~9GB of incoming TCP traffic from another
machine on the same switch) the hardware appears to disappear from
PCIe. ifconfig starts showing junk:
eth1 Link encap:Ethernet HWaddr 00:30:48:DD:EB:67
inet6 addr: fe80::230:48ff:fedd:eb67/64 Scope:Link
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:7910754 errors:532687613729670 dropped:88781268954945 overruns:0 frame:355125075819780
TX packets:4104172 errors:177562537909890 dropped:0 overruns:0 carrier:177562537909890
collisions:88781268954945 txqueuelen:1000
RX bytes:9589212936 (8.9 GiB) TX bytes:271851778 (259.2 MiB)
Memory:fafe0000-fb000000
and lspci shows that the config space is all 0xff:
[root@lilith ~]# lspci -s 04:00.0 -x
04:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection (rev ff)
00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
10: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
20: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
30: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[root@lilith ~]# lspci -s 05:00.0 -x
05:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection (rev ff)
00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
10: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
20: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
30: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
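
The junk counters and the all-0xff config space look related. As a rough check (a sketch, not a confirmed diagnosis), each error counter in the ifconfig output above divides evenly by 0xffffffff. That is what one would expect if 32-bit statistics register reads were returning all ones and being summed into the totals. The idea that the driver accumulates raw register reads this way is an assumption here and has not been checked against the e1000e source; the helper name is illustrative only.

```python
# Error counters copied from the eth1 ifconfig output above.
counters = {
    "rx_errors": 532687613729670,
    "rx_dropped": 88781268954945,
    "rx_frame": 355125075819780,
    "tx_errors": 177562537909890,
    "tx_carrier": 177562537909890,
    "collisions": 88781268954945,
}

# Value of a 32-bit read with every bit set, matching the all-0xff config space.
ALL_ONES = 0xFFFFFFFF


def multiples_of_all_ones(values):
    """Map each counter name to (quotient, remainder) after dividing by 0xffffffff."""
    return {name: divmod(value, ALL_ONES) for name, value in values.items()}


if __name__ == "__main__":
    for name, (quotient, remainder) in multiples_of_all_ones(counters).items():
        print(f"{name}: {quotient} x 0xffffffff, remainder {remainder}")
```

Every remainder comes out as zero, so the counters are consistent with repeated all-ones reads rather than with real error events on the wire.
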
This seems to happen quietly without the kernel noticing; the only
side-effect is the dev watchdog triggering:
------------[ cut here ]------------
WARNING: at net/sched/sch_generic.c:246 dev_watchdog+0xf3/0x164() (Not tainted)
Hardware name: X8SIL
NETDEV WATCHDOG: eth1 (e1000e): transmit queue 0 timed out
Modules linked in: ip6table_filter ip6_tables bridge stp llc sunrpc xt_physdev ip6t_REJECT nf_conntrack_ipv6 ipv6 cpufreq_ondemand acpi_cpufreq freq_table dm_multipath kvm_intel kvm uinput snd_emu10k1_synth snd_emux_synth snd_seq_virmidi snd_seq_midi_event snd_seq_midi_emul snd_emu10k1 snd_rawmidi snd_ac97_codec ac97_bus snd_seq snd_pcm snd_seq_device snd_timer snd_page_alloc snd_util_mem snd_hwdep snd e1000e i2c_i801 soundcore emu10k1_gp gameport i2c_core joydev cryptd aes_x86_64 aes_generic xts gf128mul dm_crypt raid10 [last unloaded: ip6_tables]
Pid: 0, comm: swapper Not tainted 2.6.31.6-166.fc12.x86_64 #1
Call Trace:
<IRQ> [<ffffffff810516f4>] warn_slowpath_common+0x84/0x9c
[<ffffffff81051763>] warn_slowpath_fmt+0x41/0x43
[<ffffffff8138e831>] ? netif_tx_lock+0x44/0x6d
[<ffffffff8138e99b>] dev_watchdog+0xf3/0x164
[<ffffffff8105bc52>] ? internal_add_timer+0xcf/0xd1
[<ffffffff8105bd0b>] ? cascade+0x6a/0x84
[<ffffffff8105bec4>] run_timer_softirq+0x19f/0x21c
[<ffffffff8106ae47>] ? hrtimer_interrupt+0x13c/0x153
[<ffffffff81057614>] __do_softirq+0xdd/0x1ad
[<ffffffff81026936>] ? apic_write+0x16/0x18
[<ffffffff81012eac>] call_softirq+0x1c/0x30
[<ffffffff810143fb>] do_softirq+0x47/0x8d
[<ffffffff81057326>] irq_exit+0x44/0x86
[<ffffffff8141ecf5>] do_IRQ+0xa5/0xbc
[<ffffffff810126d3>] ret_from_intr+0x0/0x11
<EOI> [<ffffffff812679dd>] ? acpi_idle_enter_bm+0x281/0x2b5
[<ffffffff812679d6>] ? acpi_idle_enter_bm+0x27a/0x2b5
[<ffffffff81353b7f>] ? cpuidle_idle_call+0x99/0xce
[<ffffffff81010c60>] ? cpu_idle+0xa6/0xe9
[<ffffffff81405db7>] ? rest_init+0x6b/0x6d
[<ffffffff81714dc9>] ? start_kernel+0x3ef/0x3fa
[<ffffffff817142a1>] ? x86_64_start_reservations+0xac/0xb0
[<ffffffff8171439d>] ? x86_64_start_kernel+0xf8/0x107
---[ end trace f271bce88fe9d682 ]---
0000:05:00.0: eth1: Error reading PHY register
e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
A reboot seems to recover the devices:
[root@lilith ~]# lspci -s 04:00.0 -x
04:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
00: 86 80 d3 10 07 04 10 00 00 00 00 02 10 00 00 00
10: 00 00 ee fa 00 00 00 00 01 cc 00 00 00 c0 ed fa
20: 00 00 00 00 00 00 00 00 00 00 00 00 d9 15 05 06
30: 00 00 00 00 c8 00 00 00 00 00 00 00 0a 01 00 00
[root@lilith ~]# lspci -s 05:00.0 -x
05:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
00: 86 80 d3 10 07 04 10 00 00 00 00 02 10 00 00 00
10: 00 00 fe fa 00 00 00 00 01 dc 00 00 00 c0 fd fa
20: 00 00 00 00 00 00 00 00 00 00 00 00 d9 15 05 06
30: 00 00 00 00 c8 00 00 00 00 00 00 00 0b 01 00 00
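
To tell the two states apart without eyeballing hex dumps, the first 64 bytes of config space can be decoded directly. The following is a minimal sketch: the function names are hypothetical, the field offsets follow the standard PCI type 0 header layout, and the sysfs path assumes PCI domain 0000 as shown in the kernel log above. The sample bytes are the post-reboot dump of 04:00.0.

```python
import struct


def decode_pci_header(raw: bytes):
    """Return None if the 64-byte header is all 0xff, else a dict of basic ID fields."""
    header = raw[:64]
    if len(header) < 64:
        raise ValueError("need at least 64 bytes of config space")
    if all(byte == 0xFF for byte in header):
        return None  # device is not responding to config reads
    # Vendor ID, device ID, command and status are little-endian 16-bit fields.
    vendor, device, command, status = struct.unpack_from("<HHHH", header, 0)
    # Class code is three bytes: prog-if (0x09), subclass (0x0a), base class (0x0b).
    class_code = (header[0x0B] << 16) | (header[0x0A] << 8) | header[0x09]
    return {
        "vendor": vendor,
        "device": device,
        "command": command,
        "status": status,
        "revision": header[0x08],
        "class_code": class_code,
    }


def read_config(bdf: str) -> bytes:
    """Read config space from sysfs; assumes Linux and PCI domain 0000."""
    with open(f"/sys/bus/pci/devices/0000:{bdf}/config", "rb") as f:
        return f.read(64)


# Post-reboot dump of 04:00.0, copied from the lspci output above.
healthy = bytes.fromhex(" ".join([
    "86 80 d3 10 07 04 10 00 00 00 00 02 10 00 00 00",
    "00 00 ee fa 00 00 00 00 01 cc 00 00 00 c0 ed fa",
    "00 00 00 00 00 00 00 00 00 00 00 00 d9 15 05 06",
    "00 00 00 00 c8 00 00 00 00 00 00 00 0a 01 00 00",
]))

if __name__ == "__main__":
    print(decode_pci_header(healthy))
    print(decode_pci_header(b"\xff" * 64))
```

On the live machine, `decode_pci_header(read_config("04:00.0"))` could be polled during a load test to catch the moment the device stops responding.
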
Any clues?
Thanks,
J