* TCP out of memory - possible bug [3.18.0-rc3] / sched?
From: Tomasz Mloduchowski @ 2014-11-03 18:59 UTC
To: netdev
Hi List,
I hope this is the right place to report a networking issue with
3.18.0-rc2 and 3.18.0-rc3 - under heavy P2P load (tested both
rtorrent/libtorrent and bitcoind, so not protocol-specific), the system
quickly exhausts tcp_mem limits in a very strange sequence of events.
It might be scheduler or networking subsystem related.
It's 100% reproducible on my system, first observed under 3.18.0-rc2.
Neither terminating the offending application nor removing the network
card brings the 'mem' item in /proc/net/sockstat down from its
extreme values.
http://static.qdot.me/tcp_mem_issue.png contains the plot of the 'mem'
and 'sockets' fields from sockstat - violet is 'mem', which quickly
exhausts the default limit of 512k pages after a short period of reliable
operation.
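For anyone who wants to collect the same two counters without plotting by hand, here is a minimal sketch in Python. It assumes the usual "key value" layout of /proc/net/sockstat and that the third value of /proc/sys/net/ipv4/tcp_mem is the hard limit in pages. The function names are made up for illustration, and the comparison is only an approximation of the kernel's own check.

```python
from pathlib import Path


def parse_sockstat(text):
    """Return {section: {field: int}} from /proc/net/sockstat content."""
    stats = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        name, _, rest = line.partition(":")
        tokens = rest.split()
        # Fields are alternating "key value" pairs, e.g. "inuse 5 mem 3".
        stats[name.strip()] = {
            key: int(value) for key, value in zip(tokens[::2], tokens[1::2])
        }
    return stats


def tcp_mem_report(sockstat_text, tcp_mem_text):
    """Compare the TCP 'mem' counter (pages) with the tcp_mem hard limit."""
    stats = parse_sockstat(sockstat_text)
    # tcp_mem holds three page counts: low, pressure, high.
    low, pressure, high = (int(field) for field in tcp_mem_text.split())
    mem_pages = stats["TCP"]["mem"]
    return {
        "sockets_used": stats["sockets"]["used"],
        "mem_pages": mem_pages,
        "limit_pages": high,
        # Assumption: "out of memory" is taken to mean mem above the high mark.
        "over_limit": mem_pages > high,
    }


def read_live():
    """Read the live counters from /proc (Linux only)."""
    return tcp_mem_report(
        Path("/proc/net/sockstat").read_text(),
        Path("/proc/sys/net/ipv4/tcp_mem").read_text(),
    )
```

Calling read_live() once a second and logging the result should be enough to reproduce a plot like the one linked above.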
Best Regards,
Tomasz
-- snip --
sched: RT throttling activated
kworker/dying (792) used greatest stack depth: 11984 bytes left
TCP: out of memory -- consider tuning tcp_mem
TCP: out of memory -- consider tuning tcp_mem
------------[ cut here ]------------
WARNING: CPU: 0 PID: 5802 at net/core/stream.c:201 sk_stream_kill_queues+0x12c/0x130()
Modules linked in:
CPU: 0 PID: 5802 Comm: main Not tainted 3.18.0-rc3 #2
Hardware name: LENOVO 37023L0/37023L0, BIOS GFET48WW (1.27 ) 07/01/2014
0000000000000009 ffff8800cae2fd88 ffffffff81b6c0ff 0000000000000000
0000000000000000 ffff8800cae2fdc8 ffffffff810c235c ffff8800cae2fda8
ffff8800bd996580 ffff8800bd996700 0000000000000004 ffff8800bd996610
Call Trace:
[<ffffffff81b6c0ff>] dump_stack+0x46/0x58
[<ffffffff810c235c>] warn_slowpath_common+0x7c/0xa0
[<ffffffff810c2425>] warn_slowpath_null+0x15/0x20
[<ffffffff81954fbc>] sk_stream_kill_queues+0x12c/0x130
[<ffffffff819b8b45>] inet_csk_destroy_sock+0x55/0x140
[<ffffffff819bd9ee>] tcp_close+0x22e/0x430
[<ffffffff819e3b82>] inet_release+0x72/0x80
[<ffffffff819455fa>] sock_release+0x1a/0x90
[<ffffffff8194567d>] sock_close+0xd/0x20
[<ffffffff811dfdf6>] __fput+0xc6/0x1d0
[<ffffffff811dff49>] ____fput+0x9/0x10
[<ffffffff810db94f>] task_work_run+0x8f/0xd0
[<ffffffff81046b12>] do_notify_resume+0x82/0xa0
[<ffffffff81b7613f>] int_signal+0x12/0x17
---[ end trace 080b1124407d2571 ]---
------------[ cut here ]------------
WARNING: CPU: 0 PID: 5802 at net/ipv4/af_inet.c:153 inet_sock_destruct+0x1d9/0x1e0()
Modules linked in:
CPU: 0 PID: 5802 Comm: main Tainted: G W 3.18.0-rc3 #2
Hardware name: LENOVO 37023L0/37023L0, BIOS GFET48WW (1.27 ) 07/01/2014
0000000000000009 ffff8800cae2fd68 ffffffff81b6c0ff 0000000000000000
0000000000000000 ffff8800cae2fda8 ffffffff810c235c ffff8800cae2fd88
ffff8800bd996580 ffff8800bd996700 0000000000000004 ffff8800bd996610
Call Trace:
[<ffffffff81b6c0ff>] dump_stack+0x46/0x58
[<ffffffff810c235c>] warn_slowpath_common+0x7c/0xa0
[<ffffffff810c2425>] warn_slowpath_null+0x15/0x20
[<ffffffff819e5179>] inet_sock_destruct+0x1d9/0x1e0
[<ffffffff81949e7e>] __sk_free+0x1e/0x100
[<ffffffff81949f79>] sk_free+0x19/0x20
[<ffffffff819bd918>] tcp_close+0x158/0x430
[<ffffffff819e3b82>] inet_release+0x72/0x80
[<ffffffff819455fa>] sock_release+0x1a/0x90
[<ffffffff8194567d>] sock_close+0xd/0x20
[<ffffffff811dfdf6>] __fput+0xc6/0x1d0
[<ffffffff811dff49>] ____fput+0x9/0x10
[<ffffffff810db94f>] task_work_run+0x8f/0xd0
[<ffffffff81046b12>] do_notify_resume+0x82/0xa0
[<ffffffff81b7613f>] int_signal+0x12/0x17
---[ end trace 080b1124407d2572 ]---
kworker/dying (679) used greatest stack depth: 11784 bytes left
TCP: out of memory -- consider tuning tcp_mem
-- snip --
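To compare reports like this one, it can help to reduce each log to the source locations that fired. The sketch below is an illustration only, and the helper names are invented here. It pulls file, line and function out of the WARNING headers and tolerates the header being wrapped onto a second line by the mail client.

```python
import re
from collections import Counter

# Matches e.g. "WARNING: CPU: 0 PID: 5802 at net/core/stream.c:201 func+0x12c/0x130()".
# \s+ between the location and the function also accepts a line break.
WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at "
    r"(?P<file>\S+):(?P<line>\d+)\s+"
    r"(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def warning_sites(log_text):
    """Return (file, line, function) for every WARNING header in the log."""
    return [
        (m["file"], int(m["line"]), m["func"])
        for m in WARNING_RE.finditer(log_text)
    ]


def site_counts(log_text):
    """Count warnings per 'file:line', ignoring the reported function name."""
    return Counter(f"{path}:{line}" for path, line, _ in warning_sites(log_text))


def count_oom_messages(log_text):
    """Count occurrences of the 'TCP: out of memory' message."""
    return log_text.count("TCP: out of memory")
```

Grouping by file:line rather than by function name keeps two logs comparable even when the function named in the header differs between kernel builds.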
* Re: TCP out of memory - possible bug [3.18.0-rc3] / sched?
From: Eric Dumazet @ 2014-11-03 19:41 UTC
To: Tomasz Mloduchowski; +Cc: netdev
On Mon, 2014-11-03 at 19:59 +0100, Tomasz Mloduchowski wrote:
> Hi List,
>
> I hope this is the right place to report a networking issue with
> 3.18.0-rc2 and 3.18.0-rc3 - under heavy P2P load (tested both
> rtorrent/libtorrent and bitcoind, so not protocol-specific), the system
> quickly exhausts tcp_mem limits in a very strange sequence of events.
>
> It might be scheduler or networking subsystem related.
>
> It's 100% reproducible on my system, first observed under 3.18.0-rc2.
>
Sounds like a perfect case for a bisection, maybe?
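A bisection needs a consistent good/bad verdict for each kernel that gets booted. The following is a rough sketch of one way to encode that verdict. It assumes the "TCP: out of memory" message is a sufficient sign of a bad kernel, and the function name and arguments are hypothetical. The return values follow the exit-code convention of `git bisect run` (0 good, 1 bad, 125 skip), although a kernel bisection needs a reboot per step, so the result would normally be passed to `git bisect good` or `git bisect bad` by hand.

```python
GOOD, BAD, SKIP = 0, 1, 125  # exit codes understood by `git bisect run`


def bisect_verdict(kernel_log, mem_pages=None, limit_pages=None):
    """Classify the running kernel for one bisection step.

    kernel_log  -- text of the kernel log collected after the load test
    mem_pages   -- TCP 'mem' value from /proc/net/sockstat, if available
    limit_pages -- third value of tcp_mem, if available
    """
    # The reported symptom is treated as conclusive on its own.
    if "TCP: out of memory" in kernel_log:
        return BAD
    # Without the counters there is no evidence either way.
    if mem_pages is None or limit_pages is None:
        return SKIP
    return BAD if mem_pages > limit_pages else GOOD
```

Since the report mentions a short period of reliable operation before the counter runs away, a GOOD verdict only means something if the load ran for long enough.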
* Re: TCP out of memory - possible bug [3.18.0-rc3] / sched?
From: Markus Trippelsdorf @ 2015-01-04 8:42 UTC
To: Eric Dumazet; +Cc: Tomasz Mloduchowski, netdev
On 2014.11.03 at 11:41 -0800, Eric Dumazet wrote:
> On Mon, 2014-11-03 at 19:59 +0100, Tomasz Mloduchowski wrote:
> > Hi List,
> >
> > I hope this is the right place to report a networking issue with
> > 3.18.0-rc2 and 3.18.0-rc3 - under heavy P2P load (tested both
> > rtorrent/libtorrent and bitcoind, so not protocol-specific), the system
> > quickly exhausts tcp_mem limits in a very strange sequence of events.
> >
> > It might be scheduler or networking subsystem related.
> >
> > It's 100% reproducible on my system, first observed under 3.18.0-rc2.
> >
>
> Sounds like a perfect case for a bisection, maybe?
Any news on this issue?
I stumbled across the same problem today with 3.19.0-rc2:
Jan 4 09:23:22 x4 kernel: TCP: out of memory -- consider tuning tcp_mem
Jan 4 09:23:23 x4 kernel: ------------[ cut here ]------------
Jan 4 09:23:23 x4 kernel: WARNING: CPU: 3 PID: 32504 at net/core/stream.c:201 inet_csk_destroy_sock+0x4d/0x100()
Jan 4 09:23:23 x4 kernel: CPU: 3 PID: 32504 Comm: main Not tainted 3.19.0-rc2-00193-g5e0f872c7d7e-dirty #52
Jan 4 09:23:23 x4 kernel: Hardware name: System manufacturer System Product Name/M4A78T-E, BIOS 3503 04/13/2011
Jan 4 09:23:23 x4 kernel: 0000000000000000 ffffffff819b8544 ffffffff81686ffc 0000000000000000
Jan 4 09:23:23 x4 kernel: ffffffff8106b352 ffff88008a623a80 ffff88008a623be8 0000000000000004
Jan 4 09:23:23 x4 kernel: ffff88008a623ae8 ffff88012d3f9240 ffffffff8162fead ffff88008a623a80
Jan 4 09:23:23 x4 kernel: Call Trace:
Jan 4 09:23:23 x4 kernel: [<ffffffff81686ffc>] ? dump_stack+0x40/0x50
Jan 4 09:23:23 x4 kernel: [<ffffffff8106b352>] ? warn_slowpath_common+0x72/0xc0
Jan 4 09:23:23 x4 kernel: [<ffffffff8162fead>] ? inet_csk_destroy_sock+0x4d/0x100
Jan 4 09:23:23 x4 kernel: [<ffffffff81633746>] ? tcp_close+0x226/0x400
Jan 4 09:23:23 x4 kernel: [<ffffffff816582cd>] ? inet_release+0x6d/0x80
Jan 4 09:23:23 x4 kernel: [<ffffffff815cf090>] ? sock_release+0x10/0x80
Jan 4 09:23:23 x4 kernel: [<ffffffff815cf10d>] ? sock_close+0xd/0x20
Jan 4 09:23:23 x4 kernel: [<ffffffff8110eeca>] ? __fput+0xca/0x1e0
Jan 4 09:23:23 x4 kernel: [<ffffffff81081c0f>] ? task_work_run+0xaf/0x100
Jan 4 09:23:23 x4 kernel: [<ffffffff81689734>] ? __schedule+0x254/0x760
Jan 4 09:23:23 x4 kernel: [<ffffffff81039ae1>] ? do_notify_resume+0x61/0xa0
Jan 4 09:23:23 x4 kernel: [<ffffffff8110f33e>] ? fput+0x3e/0x80
Jan 4 09:23:23 x4 kernel: [<ffffffff8168dc18>] ? int_signal+0x12/0x17
Jan 4 09:23:23 x4 kernel: ---[ end trace ccb3dc614f5600b1 ]---
Jan 4 09:23:23 x4 kernel: ------------[ cut here ]------------
Jan 4 09:23:23 x4 kernel: WARNING: CPU: 3 PID: 32504 at net/ipv4/af_inet.c:153 inet_sock_destruct+0x169/0x1e0()
Jan 4 09:23:23 x4 kernel: CPU: 3 PID: 32504 Comm: main Tainted: G W 3.19.0-rc2-00193-g5e0f872c7d7e-dirty #52
Jan 4 09:23:23 x4 kernel: Hardware name: System manufacturer System Product Name/M4A78T-E, BIOS 3503 04/13/2011
Jan 4 09:23:23 x4 kernel: 0000000000000000 ffffffff819b9876 ffffffff81686ffc 0000000000000000
Jan 4 09:23:23 x4 kernel: ffffffff8106b352 ffff88008a623a80 ffff88008a623bc8 ffff880100104a30
Jan 4 09:23:23 x4 kernel: ffff880216864660 ffff88012d3f9240 ffffffff816591e9 ffff8800d5e2a810
Jan 4 09:23:23 x4 kernel: Call Trace:
Jan 4 09:23:23 x4 kernel: [<ffffffff81686ffc>] ? dump_stack+0x40/0x50
Jan 4 09:23:23 x4 kernel: [<ffffffff8106b352>] ? warn_slowpath_common+0x72/0xc0
Jan 4 09:23:23 x4 kernel: [<ffffffff816591e9>] ? inet_sock_destruct+0x169/0x1e0
Jan 4 09:23:23 x4 kernel: [<ffffffff815d1792>] ? __sk_free+0x12/0xc0
Jan 4 09:23:23 x4 kernel: [<ffffffff816582cd>] ? inet_release+0x6d/0x80
Jan 4 09:23:23 x4 kernel: [<ffffffff815cf090>] ? sock_release+0x10/0x80
Jan 4 09:23:23 x4 kernel: [<ffffffff815cf10d>] ? sock_close+0xd/0x20
Jan 4 09:23:23 x4 kernel: [<ffffffff8110eeca>] ? __fput+0xca/0x1e0
Jan 4 09:23:23 x4 kernel: [<ffffffff81081c0f>] ? task_work_run+0xaf/0x100
Jan 4 09:23:23 x4 kernel: [<ffffffff81689734>] ? __schedule+0x254/0x760
Jan 4 09:23:23 x4 kernel: [<ffffffff81039ae1>] ? do_notify_resume+0x61/0xa0
Jan 4 09:23:23 x4 kernel: [<ffffffff8110f33e>] ? fput+0x3e/0x80
Jan 4 09:23:23 x4 kernel: [<ffffffff8168dc18>] ? int_signal+0x12/0x17
Jan 4 09:23:23 x4 kernel: ---[ end trace ccb3dc614f5600b2 ]---
rtorrent was spinning and the network was unusable until reboot.
--
Markus
* Re: TCP out of memory - possible bug [3.18.0-rc3] / sched?
From: Eric Dumazet @ 2015-01-04 21:01 UTC
To: Markus Trippelsdorf; +Cc: Tomasz Mloduchowski, netdev
On Sun, 2015-01-04 at 09:42 +0100, Markus Trippelsdorf wrote:
> Any news on this issue?
>
> I stumbled across the same problem today with 3.19.0-rc2:
What makes you think it was fixed in 3.19.0-rc2?
Have you tried a bisection?
* Re: TCP out of memory - possible bug [3.18.0-rc3] / sched?
From: Markus Trippelsdorf @ 2015-01-04 21:39 UTC
To: Eric Dumazet; +Cc: Tomasz Mloduchowski, netdev
On 2015.01.04 at 13:01 -0800, Eric Dumazet wrote:
> On Sun, 2015-01-04 at 09:42 +0100, Markus Trippelsdorf wrote:
>
> > Any news on this issue?
> >
> > I stumbled across the same problem today with 3.19.0-rc2:
>
> What makes you think it was fixed in 3.19.0-rc2?
Nothing.
> Have you tried a bisection?
It was a one-time event for me, so a bisection is out of the question.
But Tomasz wrote that he can reproduce the issue reliably.
Have you tried a bisection, Tomasz?
--
Markus