From: Tony Battersby <tonyb@cybernetics.com>
To: Stephen Hemminger <shemminger@linux-foundation.org>,
netdev@vger.kernel.org
Subject: sky2: tx hang on dual-port Yukon XL when rx csum disabled
Date: Mon, 28 Jan 2008 13:43:19 -0500
Message-ID: <479E2247.8080109@cybernetics.com>

I am experiencing network tx hangs on a dual-port SK-9E22 with sky2 in
2.6.24. The problem is triggered by both ports transmitting at high
speed simultaneously. It reproduces quickly, 100% of the time. Here
is the setup:

PC #1 with Intel PRO/1000 NIC:
    e1000 IP address 192.168.1.1
    running iperf -s

PC #2 with Intel PRO/1000 NIC:
    e1000 IP address 192.168.2.1
    running iperf -s

PC #3 with SysKonnect SK-9E22 (dual-port copper PCI-express):
    sky2 IP address 192.168.1.2
    sky2 IP address 192.168.2.2

So basically, I have two PCs with Intel PRO/1000 NICs running "iperf
-s". Each of these Intel NICs is directly cabled to one of the two
ports of the SysKonnect NIC.

When I run:

    (PC #3 tty1) iperf -c 192.168.1.1 -t 30
    (wait for a second or two)
    (PC #3 tty2) iperf -c 192.168.2.1 -t 30

"iperf -c 192.168.1.1" never finishes, but "iperf -c 192.168.2.1" does
finish. I have to press Ctrl-C to abort the hung iperf. Afterward,
pinging 192.168.1.1 gets no response. Pinging 192.168.2.1 does get a
response, but each ping has almost exactly 1 second of latency (the
latency should be < 1 ms).
When I switch the order of the tests, whichever iperf -c was started
_first_ is the one that locks up with no ping afterward, and whichever
was started _second_ is the one that finishes, but with a 1-second ping
latency afterward. So the problem follows the ordering of the tests
rather than a specific port.
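
The reproduction steps above could be scripted roughly as follows. This is
only a sketch, not something taken from the original report: it assumes
iperf (version 2), ping, and GNU timeout are installed on PC #3 and that
"iperf -s" is already running on both peers. The variable names, the GRACE
margin, and the "run" argument are made up for illustration.

```shell
#!/bin/sh
# Reproduction sketch for the tx hang (assumptions noted in the text above).
FIRST=${FIRST:-192.168.1.1}     # peer contacted first (the client that hung)
SECOND=${SECOND:-192.168.2.1}   # peer contacted second
DURATION=${DURATION:-30}        # seconds, as in "-t 30"
DELAY=${DELAY:-2}               # "wait for a second or two"
GRACE=${GRACE:-10}              # hypothetical extra time before giving up

# Print the iperf client command line for one peer.
iperf_cmd() {
    echo "iperf -c $1 -t $DURATION"
}

repro() {
    # GNU timeout exits with status 124 if it had to stop the command.
    timeout "$((DURATION + GRACE))" $(iperf_cmd "$FIRST") &
    first_pid=$!
    sleep "$DELAY"
    $(iperf_cmd "$SECOND")
    wait "$first_pid"
    if [ $? -eq 124 ]; then
        echo "iperf to $FIRST did not finish within $((DURATION + GRACE))s"
    fi
    # Compare ping behaviour on both links afterwards.
    ping -c 3 "$FIRST"
    ping -c 3 "$SECOND"
}

# Only run when asked, so the file can also be sourced.
if [ "${1:-}" = run ]; then
    repro
fi
```

Swapping the order of the two tests is then a matter of setting FIRST and
SECOND the other way around before invoking the script with "run".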
Also, the trigger seems to be transmitting, not receiving. If I run
"iperf -s" on the SysKonnect PC and "iperf -c" on the two Intel PRO/1000
PCs, then the tests pass.
When I do "ethtool -K eth0 rx on; ethtool -K eth1 rx on" to turn on rx
checksumming on both ports of the SysKonnect NIC, both tests pass
successfully. Commit 8b31cfbcd1b54362ef06c85beb40e65a349169a2 "sky2:
disable rx checksum on Yukon XL" disabled rx checksumming by default on
this NIC to get rid of some "hw csum failure" messages
(http://marc.info/?l=linux-netdev&m=119497815523843&w=4). However, this
seems to have exposed a different (and arguably worse) bug.
I also tried booting with "maxcpus=1 pci=nomsi", but that didn't affect
the problem.
As a temporary workaround, I will use ethtool to turn on rx checksumming
and live with the "hw csum failure" messages, since they are better than
network lockups.
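
For reference, the workaround could be applied to both ports with something
like the sketch below. It assumes the two sky2 ports are eth0 and eth1, as
in the ethtool commands quoted earlier; the PORTS variable and the "apply"
argument are made up for illustration. ethtool -K generally needs root, and
the setting is not expected to survive a reboot or a driver reload, so it
would have to be reapplied.

```shell
#!/bin/sh
# Workaround sketch: re-enable rx checksumming on both sky2 ports.
# Assumption: the ports are named eth0 and eth1.
PORTS=${PORTS:-"eth0 eth1"}

# Print the ethtool command that turns rx checksumming on for one port.
rx_csum_cmd() {
    echo "ethtool -K $1 rx on"
}

apply_workaround() {
    for port in $PORTS; do
        $(rx_csum_cmd "$port") || return 1
        # Lowercase -k prints the resulting offload settings for checking.
        ethtool -k "$port"
    done
}

# Only change settings when asked, so the file can also be sourced.
if [ "${1:-}" = apply ]; then
    apply_workaround
fi
```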
Let me know if I can be of any further assistance in tracking down this
problem.
Tony Battersby
Cybernetics