From: Ingo Molnar <mingo@elte.hu>
To: David Miller <davem@davemloft.net>
Cc: vgusev@openvz.org, e1000-devel@lists.sourceforge.net,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
"Rafael J. Wysocki" <rjw@sisk.pl>,
mcmanus@ducksong.com, ilpo.jarvinen@helsinki.fi,
kuznet@ms2.inr.ac.ru, xemul@openvz.org
Subject: Re: [TCP]: TCP_DEFER_ACCEPT causes leak sockets
Date: Tue, 17 Jun 2008 10:32:20 +0200
Message-ID: <20080617083220.GA11393@elte.hu>
In-Reply-To: <20080617080958.GC12535@elte.hu>
* Ingo Molnar <mingo@elte.hu> wrote:
>
> > FWIW I don't think your TX timeout problem has anything to do with
> > packet ordering. The TX element of the network device is totally
> > stateless, but it's hanging under some set of circumstances to the
> > point where we timeout and reset the hardware to get it going again.
>
> ok. That's e1000 then. Cc:s added. Stock T60 laptop, 32-bit:
>
> 02:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet Controller
> Subsystem: Lenovo ThinkPad T60
> Flags: bus master, fast devsel, latency 0, IRQ 16
> Memory at ee000000 (32-bit, non-prefetchable) [size=128K]
> I/O ports at 2000 [size=32]
> Capabilities: <access denied>
> Kernel driver in use: e1000
>
> the problem is this non-fatal warning showing up after bootup,
> sporadically, in a non-reproducible way:
>
> [ 173.354049] NETDEV WATCHDOG: eth0: transmit timed out
> [ 173.354148] ------------[ cut here ]------------
> [ 173.354221] WARNING: at net/sched/sch_generic.c:222 dev_watchdog+0x9a/0xec()
> [ 173.354298] Modules linked in:
> [ 173.354421] Pid: 13452, comm: cc1 Tainted: G W 2.6.26-rc6-00273-g81ae43a-dirty #2573
> [ 173.354516] [<c01250ca>] warn_on_slowpath+0x46/0x76
> [ 173.354641] [<c011d428>] ? try_to_wake_up+0x1d6/0x1e0
> [ 173.354815] [<c01411e9>] ? trace_hardirqs_off+0xb/0xd
> [ 173.357370] [<c011d43d>] ? default_wake_function+0xb/0xd
> [ 173.357370] [<c014112a>] ? trace_hardirqs_off_caller+0x15/0xc9
> [ 173.357370] [<c01411e9>] ? trace_hardirqs_off+0xb/0xd
> [ 173.357370] [<c0142c83>] ? trace_hardirqs_on+0xb/0xd
> [ 173.357370] [<c0142b33>] ? trace_hardirqs_on_caller+0x16/0x15b
> [ 173.357370] [<c0142c83>] ? trace_hardirqs_on+0xb/0xd
> [ 173.357370] [<c06bb3c9>] ? _spin_unlock_irqrestore+0x5b/0x71
> [ 173.357370] [<c0133d46>] ? __queue_work+0x2d/0x32
> [ 173.357370] [<c0134023>] ? queue_work+0x50/0x72
> [ 173.357483] [<c0134059>] ? schedule_work+0x14/0x16
> [ 173.357654] [<c05c59b8>] dev_watchdog+0x9a/0xec
> [ 173.357783] [<c012d456>] run_timer_softirq+0x13d/0x19d
> [ 173.357905] [<c05c591e>] ? dev_watchdog+0x0/0xec
> [ 173.358073] [<c05c591e>] ? dev_watchdog+0x0/0xec
> [ 173.360804] [<c0129ad7>] __do_softirq+0xb2/0x15c
> [ 173.360804] [<c0129a25>] ? __do_softirq+0x0/0x15c
> [ 173.360804] [<c0105526>] do_softirq+0x84/0xe9
> [ 173.360804] [<c0129996>] irq_exit+0x4b/0x88
> [ 173.360804] [<c010ec7a>] smp_apic_timer_interrupt+0x73/0x81
> [ 173.360804] [<c0103ddd>] apic_timer_interrupt+0x2d/0x34
> [ 173.360804] =======================
> [ 173.360804] ---[ end trace a7919e7f17c0a725 ]---
>
> full report can be found at:
>
> http://lkml.org/lkml/2008/6/13/224
>
> i have 3 other test-systems with e1000 (with a similar CPU) which are
> _not_ showing this symptom, so this could be some model-specific e1000
> issue.
btw., this reminds me that this is the same system that has a serious
e1000 network latency bug which i reported more than a year ago, and
which still does not appear to be fixed in latest mainline:
PING europe (10.0.1.15) 56(84) bytes of data.
64 bytes from europe (10.0.1.15): icmp_seq=1 ttl=64 time=1.51 ms
64 bytes from europe (10.0.1.15): icmp_seq=2 ttl=64 time=404 ms
64 bytes from europe (10.0.1.15): icmp_seq=3 ttl=64 time=487 ms
64 bytes from europe (10.0.1.15): icmp_seq=4 ttl=64 time=296 ms
64 bytes from europe (10.0.1.15): icmp_seq=5 ttl=64 time=305 ms
64 bytes from europe (10.0.1.15): icmp_seq=6 ttl=64 time=1011 ms
64 bytes from europe (10.0.1.15): icmp_seq=7 ttl=64 time=0.209 ms
64 bytes from europe (10.0.1.15): icmp_seq=8 ttl=64 time=763 ms
64 bytes from europe (10.0.1.15): icmp_seq=9 ttl=64 time=1000 ms
64 bytes from europe (10.0.1.15): icmp_seq=10 ttl=64 time=0.438 ms
64 bytes from europe (10.0.1.15): icmp_seq=11 ttl=64 time=1000 ms
64 bytes from europe (10.0.1.15): icmp_seq=12 ttl=64 time=0.299 ms
^C
--- europe ping statistics ---
12 packets transmitted, 12 received, 0% packet loss, time 11085ms
those delays of up to 1000 msec can be 'felt' via ssh too: when this
problem triggers, the system is almost unusable via the network. Local
latencies are perfect, so it's an e1000 problem.
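To put rough numbers on the ping run above, here is a minimal sketch that parses the quoted output and summarizes the round-trip times. The helper names (parse_rtts, summarize) and the 100 ms "slow" cutoff are assumptions made for illustration only; they are not part of ping or of the e1000 driver.

```python
import re
import statistics

# The twelve replies quoted in this message, copied verbatim.
PING_OUTPUT = """\
64 bytes from europe (10.0.1.15): icmp_seq=1 ttl=64 time=1.51 ms
64 bytes from europe (10.0.1.15): icmp_seq=2 ttl=64 time=404 ms
64 bytes from europe (10.0.1.15): icmp_seq=3 ttl=64 time=487 ms
64 bytes from europe (10.0.1.15): icmp_seq=4 ttl=64 time=296 ms
64 bytes from europe (10.0.1.15): icmp_seq=5 ttl=64 time=305 ms
64 bytes from europe (10.0.1.15): icmp_seq=6 ttl=64 time=1011 ms
64 bytes from europe (10.0.1.15): icmp_seq=7 ttl=64 time=0.209 ms
64 bytes from europe (10.0.1.15): icmp_seq=8 ttl=64 time=763 ms
64 bytes from europe (10.0.1.15): icmp_seq=9 ttl=64 time=1000 ms
64 bytes from europe (10.0.1.15): icmp_seq=10 ttl=64 time=0.438 ms
64 bytes from europe (10.0.1.15): icmp_seq=11 ttl=64 time=1000 ms
64 bytes from europe (10.0.1.15): icmp_seq=12 ttl=64 time=0.299 ms
"""

# Matches the sequence number and the RTT on one reply line.
RTT_RE = re.compile(r"icmp_seq=(\d+) .*time=([\d.]+) ms")


def parse_rtts(text):
    """Return a list of (icmp_seq, rtt_ms) tuples found in ping output."""
    return [(int(m.group(1)), float(m.group(2))) for m in RTT_RE.finditer(text)]


def summarize(rtts, slow_ms=100.0):
    """Summarize RTTs; slow_ms is an arbitrary cutoff chosen for this sketch."""
    values = [rtt for _, rtt in rtts]
    return {
        "count": len(values),
        "min_ms": min(values),
        "median_ms": statistics.median(values),
        "max_ms": max(values),
        "slow_seqs": [seq for seq, rtt in rtts if rtt >= slow_ms],
    }


if __name__ == "__main__":
    for key, value in summarize(parse_rtts(PING_OUTPUT)).items():
        print(f"{key}: {value}")
```

With that cutoff, 8 of the 12 replies count as slow, while the other four come back in under 2 ms, so the latency looks bimodal rather than uniformly elevated.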
Ingo