* Fw: [Bug 195713] New: TCP recv queue grows huge
@ 2017-05-11 16:47 Stephen Hemminger
2017-05-11 17:06 ` Eric Dumazet
0 siblings, 1 reply; 6+ messages in thread
From: Stephen Hemminger @ 2017-05-11 16:47 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev
Begin forwarded message:
Date: Thu, 11 May 2017 13:25:23 +0000
From: bugzilla-daemon@bugzilla.kernel.org
To: stephen@networkplumber.org
Subject: [Bug 195713] New: TCP recv queue grows huge
https://bugzilla.kernel.org/show_bug.cgi?id=195713
Bug ID: 195713
Summary: TCP recv queue grows huge
Product: Networking
Version: 2.5
Kernel Version: 3.13.0 4.4.0 4.9.0
Hardware: All
OS: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: IPV4
Assignee: stephen@networkplumber.org
Reporter: mkm@nabto.com
Regression: No
I was testing how TCP handles advertised reductions of the window size,
especially Window Full events. To create this setup I made a slow TCP receiver
and a fast TCP sender. To add some realism to the scenario I simulated 10ms
of delay on the loopback device using the netem tc module.
Steps to reproduce:
Beware: these steps will use all the memory on your system.
1. create latency on loopback
>sudo tc qdisc change dev lo root netem delay 0ms
2. slow tcp receiver:
>nc -l 4242 | pv -L 1k
3. fast tcp sender:
>nc 127.0.0.1 4242 < /dev/zero
What to expect:
It is expected that the TCP recv queue does not grow unbounded, e.g. as in the
following output from netstat:
>netstat -an | grep 4242
>tcp 5563486 0 127.0.0.1:4242 127.0.0.1:59113
>ESTABLISHED
>tcp 0 3415559 127.0.0.1:59113 127.0.0.1:4242
>ESTABLISHED
What is seen:
The TCP receive queue grows until there is no more memory available on the
system.
>netstat -an | grep 4242
>tcp 223786525 0 127.0.0.1:4242 127.0.0.1:59114
>ESTABLISHED
>tcp 0 4191037 127.0.0.1:59114 127.0.0.1:4242
>ESTABLISHED
Note: after the TCP recv queue reaches ~2^31 bytes, netstat reports 0, which
is not correct; netstat was probably not written with queue sizes this large
in mind.
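A plausible explanation for the bogus 0, sketched under the assumption (not
verified against netstat's source) that the tool parses the queue size into a
signed 32-bit integer, which overflows once the true value passes 2^31 - 1:

	#include <stdio.h>

	int main(void)
	{
		/* A queue size just past 2^31 - 1 no longer fits in a
		 * signed 32-bit int; converting it wraps to a negative
		 * value on common platforms, which a tool might then
		 * clamp or format as 0.
		 */
		long long queued = 2147483648LL; /* ~2^31 bytes queued */
		int parsed = (int)queued;        /* implementation-defined wrap */

		printf("actual: %lld, as 32-bit int: %d\n", queued, parsed);
		return 0;
	}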
Systems on which the bug is reproducible:
* debian testing, kernel 4.9.0
* ubuntu 14.04, kernel 3.13.0
* ubuntu 16.04, kernel 4.4.0
I have not tested on systems other than those mentioned above.
--
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Fw: [Bug 195713] New: TCP recv queue grows huge
  2017-05-11 16:47 Fw: [Bug 195713] New: TCP recv queue grows huge Stephen Hemminger
@ 2017-05-11 17:06 ` Eric Dumazet
  2017-05-11 19:29   ` Michael Madsen
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2017-05-11 17:06 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Eric Dumazet, netdev, mkm

On Thu, 2017-05-11 at 09:47 -0700, Stephen Hemminger wrote:
> Begin forwarded message:
>
> Subject: [Bug 195713] New: TCP recv queue grows huge
> https://bugzilla.kernel.org/show_bug.cgi?id=195713
>
> [full bug report, quoted verbatim from the message above, trimmed]

Not reproducible on my test machine.

Somehow some sysctl must have been set to an insane value by
mkm@nabto.com ?

Please use/report ss -temoi instead of old netstat which does not
provide info.
lpaa23:~# tc -s -d qd sh dev lo
qdisc netem 8002: root refcnt 2 limit 1000
 Sent 1153017 bytes 388 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0

lpaa23:~# ss -temoi dst :4242 or src :4242
State   Recv-Q  Send-Q   Local Address:Port    Peer Address:Port
ESTAB   0       3255206  127.0.0.1:35672       127.0.0.1:4242
	 timer:(persist,15sec,0) ino:3740676 sk:1 <->
	 skmem:(r0,rb1060272,t0,tb4194304,f2650,w3319206,o0,bl0,d0) ts sack
	 cubic wscale:8,8 rto:230 backoff:7 rtt:20.879/26.142 mss:65483
	 rcvmss:536 advmss:65483 cwnd:19 ssthresh:19 bytes_acked:3258385
	 segs_out:86 segs_in:50 data_segs_out:68 send 476.7Mbps lastsnd:43940
	 lastrcv:163390 lastack:13500 pacing_rate 572.0Mbps delivery_rate
	 11146.0Mbps busy:163390ms rwnd_limited:163380ms(100.0%) retrans:0/1
	 rcv_space:43690 notsent:3255206 minrtt:0.002
ESTAB   3022864 0        127.0.0.1:4242        127.0.0.1:35672
	 ino:3703653 sk:2 <->
	 skmem:(r3259664,rb3406910,t0,tb2626560,f752,w0,o0,bl0,d17) ts sack
	 cubic wscale:8,8 rto:210 rtt:0.019/0.009 ato:120 mss:21888
	 rcvmss:65483 advmss:65483 cwnd:10 bytes_received:3258384 segs_out:49
	 segs_in:86 data_segs_in:68 send 92160.0Mbps lastsnd:163390
	 lastrcv:43940 lastack:43940 rcv_rtt:0.239 rcv_space:61440
	 minrtt:0.019

lpaa23:~# uname -a
Linux lpaa23 4.11.0-smp-DEV #197 SMP @1494476384 x86_64 GNU/Linux

^ permalink raw reply	[flat|nested] 6+ messages in thread
* Re: Fw: [Bug 195713] New: TCP recv queue grows huge
  2017-05-11 17:06 ` Eric Dumazet
@ 2017-05-11 19:29   ` Michael Madsen
  2017-05-11 19:42     ` Eric Dumazet
  0 siblings, 1 reply; 6+ messages in thread
From: Michael Madsen @ 2017-05-11 19:29 UTC (permalink / raw)
  To: Eric Dumazet, Stephen Hemminger; +Cc: Eric Dumazet, netdev

On 05/11/2017 07:06 PM, Eric Dumazet wrote:
> On Thu, 2017-05-11 at 09:47 -0700, Stephen Hemminger wrote:
>> [full bug report trimmed]
>
> Not reproducible on my test machine.
>
> Somehow some sysctl must have been set to an insane value by
> mkm@nabto.com ?
>
> Please use/report ss -temoi instead of old netstat which does not
> provide info.
> [Eric's tc/ss/uname test output trimmed]

I've made an error in the bug report, sorry; the tc step should set a
nonzero delay, e.g.
tc qdisc change dev lo root netem delay 100ms

tc -s -d qd sh dev lo
qdisc netem 8001: root refcnt 2 limit 1000 delay 100.0ms
 Sent 2310729789 bytes 56051 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0

netstat -an | grep 4242
tcp   1737737598       0 127.0.0.1:4242    127.0.0.1:47724   ESTABLISHED
tcp            0 3734810 127.0.0.1:47724   127.0.0.1:4242    ESTABLISHED

ss -temoi dst :4242 or src :4242
State   Recv-Q      Send-Q   Local Address:Port   Peer Address:Port
ESTAB   1771226600  0        127.0.0.1:4242       127.0.0.1:47724
	 uid:1000 ino:248318 sk:21 <->
	 skmem:(r4292138050,rb5633129,t40,tb2626560,f3006,w0,o0,bl0,d0) ts
	 sack cubic wscale:7,7 rto:600 rtt:200.15/100.075 ato:40 mss:21888
	 cwnd:10 bytes_received:1771576125 segs_out:13932 segs_in:27728
	 data_segs_in:27726 send 8.7Mbps lastsnd:132144 lastrcv:4
	 lastack:4852 pacing_rate 17.5Mbps rcv_rtt:202 rcv_space:188413
	 minrtt:200.15
ESTAB   0           3866200  127.0.0.1:47724      127.0.0.1:4242
	 timer:(on,372ms,0) uid:1000 ino:246613 sk:22 <->
	 skmem:(r0,rb1061808,t4,tb4194304,f267688,w3943000,o0,bl0,d0) ts sack
	 cubic wscale:7,7 rto:404 rtt:200.112/0.058 mss:65483 cwnd:89
	 bytes_acked:1769019586 segs_out:27732 segs_in:13913
	 data_segs_out:27730 send 233.0Mbps lastsnd:32 lastrcv:26247708
	 lastack:32 pacing_rate 466.0Mbps unacked:44 rcv_space:43690
	 notsent:1047728 minrtt:200.011

uname -a
Linux mkm 4.9.0-2-amd64 #1 SMP Debian 4.9.18-1 (2017-03-30) x86_64 GNU/Linux

^ permalink raw reply	[flat|nested] 6+ messages in thread
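Two details in this corrected run are worth flagging. First, skmem on the
receiving socket shows r4292138050 against rb5633129: an r value of roughly
2^32 is consistent with sk_rmem_alloc having wrapped, i.e. the receive-side
accounting is corrupted outright, not merely over its limit. Second, the
nonzero delay matters because netem only orphans packets when a delay or
jitter is configured; with delay 0ms the buggy path is never reached. The
relevant enqueue check, sketched from the v4.9-era net/sched/sch_netem.c
(quoted from memory, so the exact wording in the tree may differ):

	/* In netem_enqueue(): orphaning normally happens at TX completion
	 * time, so netem performs it up front, before the simulated link
	 * transit delay.  With zero latency and no jitter this branch,
	 * and therefore the bug, is skipped entirely.
	 */
	if (q->latency || q->jitter)
		skb_orphan_partial(skb);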
* Re: Fw: [Bug 195713] New: TCP recv queue grows huge
  2017-05-11 19:29 ` Michael Madsen
@ 2017-05-11 19:42   ` Eric Dumazet
  2017-05-11 22:24     ` [PATCH net] netem: fix skb_orphan_partial() Eric Dumazet
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2017-05-11 19:42 UTC (permalink / raw)
  To: Michael Madsen; +Cc: Stephen Hemminger, Eric Dumazet, netdev

On Thu, 2017-05-11 at 21:29 +0200, Michael Madsen wrote:
> [deep-quoted bug report and earlier replies trimmed]
> [Eric's test output and Michael's corrected numbers trimmed]

Oh this is a bug in netem, using skb_orphan_partial() even for packets
that might loop back to this host.

I will send a fix, thanks for the report.

^ permalink raw reply	[flat|nested] 6+ messages in thread
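For readers reconstructing the failure mode: before the fix below,
skb_orphan_partial() rewrote skb->truesize to 1 when netem queued a delayed
packet. Over loopback that same skb is later charged to the receiving
socket's memory accounting via skb->truesize, so each queued segment costs
the receiver almost nothing on paper and the usual rcvbuf limit never
triggers. A deliberately simplified userspace sketch of that mismatch (toy
types and helper names, not the actual kernel code):

	#include <stdio.h>

	/* Toy model: the receiver admits data while its accounted memory
	 * (rmem_alloc) stays under its buffer limit (rcvbuf), charging
	 * each skb by its truesize, as skb_set_owner_r() does in the
	 * kernel.
	 */
	struct toy_skb  { unsigned int truesize; };
	struct toy_sock { unsigned int rmem_alloc, rcvbuf; };

	/* Pre-fix netem behavior: pretend the packet costs one byte. */
	static void buggy_orphan_partial(struct toy_skb *skb)
	{
		skb->truesize = 1;
	}

	static int receiver_accepts(const struct toy_sock *sk,
				    const struct toy_skb *skb)
	{
		return sk->rmem_alloc + skb->truesize <= sk->rcvbuf;
	}

	int main(void)
	{
		struct toy_sock rx = { .rmem_alloc = 0, .rcvbuf = 1u << 20 };
		unsigned long long real_bytes = 0;
		unsigned long accepted;

		for (accepted = 0; accepted < 1000000; accepted++) {
			struct toy_skb skb = { .truesize = 65536 };

			buggy_orphan_partial(&skb);	/* truesize -> 1 */
			if (!receiver_accepts(&rx, &skb))
				break;		/* needs ~2^20 skbs to fire */
			rx.rmem_alloc += skb.truesize;	/* +1, not +65536 */
			real_bytes += 65536;	/* actual memory consumed */
		}
		printf("accepted %lu skbs; accounted %u bytes; real ~%llu bytes\n",
		       accepted, rx.rmem_alloc, real_bytes);
		return 0;
	}

Running this, the receiver "holds" about 1 MB on paper after a million
segments while roughly 65 GB of real buffer memory has gone past it, which
is exactly the unbounded recv queue Michael observed.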
* [PATCH net] netem: fix skb_orphan_partial()
  2017-05-11 19:42 ` Eric Dumazet
@ 2017-05-11 22:24   ` Eric Dumazet
  2017-05-12  1:33     ` David Miller
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2017-05-11 22:24 UTC (permalink / raw)
  To: Michael Madsen, David Miller; +Cc: Stephen Hemminger, Eric Dumazet, netdev

From: Eric Dumazet <edumazet@google.com>

I should have known that lowering skb->truesize was dangerous :/

In case packets are not leaving the host via a standard Ethernet device,
but looped back to local sockets, bad things can happen, as reported
by Michael Madsen ( https://bugzilla.kernel.org/show_bug.cgi?id=195713 )

So instead of tweaking skb->truesize, lets change skb->destructor
and keep a reference on the owner socket via its sk_refcnt.

Fixes: f2f872f9272a ("netem: Introduce skb_orphan_partial() helper")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Michael Madsen <mkm@nabto.com>
---
 net/core/sock.c | 20 ++++++++------------
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/net/core/sock.c b/net/core/sock.c
index 79c6aee6af9b817bd7086f04ae8f46342a3bf4b6..e43e71d7856b385111cd4c4b1bd835a78c670c60 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1803,28 +1803,24 @@ EXPORT_SYMBOL(skb_set_owner_w);
  * delay queue. We want to allow the owner socket to send more
  * packets, as if they were already TX completed by a typical driver.
  * But we also want to keep skb->sk set because some packet schedulers
- * rely on it (sch_fq for example). So we set skb->truesize to a small
- * amount (1) and decrease sk_wmem_alloc accordingly.
+ * rely on it (sch_fq for example).
  */
 void skb_orphan_partial(struct sk_buff *skb)
 {
-	/* If this skb is a TCP pure ACK or already went here,
-	 * we have nothing to do. 2 is already a very small truesize.
-	 */
-	if (skb->truesize <= 2)
+	if (skb_is_tcp_pure_ack(skb))
 		return;
 
-	/* TCP stack sets skb->ooo_okay based on sk_wmem_alloc,
-	 * so we do not completely orphan skb, but transfert all
-	 * accounted bytes but one, to avoid unexpected reorders.
-	 */
 	if (skb->destructor == sock_wfree
 #ifdef CONFIG_INET
 	    || skb->destructor == tcp_wfree
 #endif
 	    ) {
-		atomic_sub(skb->truesize - 1, &skb->sk->sk_wmem_alloc);
-		skb->truesize = 1;
+		struct sock *sk = skb->sk;
+
+		if (atomic_inc_not_zero(&sk->sk_refcnt)) {
+			atomic_sub(skb->truesize, &sk->sk_wmem_alloc);
+			skb->destructor = sock_efree;
+		}
 	} else {
 		skb_orphan(skb);
 	}

^ permalink raw reply related	[flat|nested] 6+ messages in thread
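A note on the lifetime half of this fix: because netem can hold the skb well
past the point where the sender would normally have been charged for it, the
fixed skb_orphan_partial() takes a reference on the owner socket with
atomic_inc_not_zero(&sk->sk_refcnt) and installs sock_efree as the destructor
to drop it exactly once when the skb is finally freed. For reference,
sock_efree() as found in net/core/sock.c of this era is just the matching
put (the comment below is editorial, not kernel source):

	/* skb->destructor installed by the fixed skb_orphan_partial():
	 * releases the socket reference taken at orphan time, whether
	 * the skb is consumed by a device or looped back to a local
	 * receiver that charged its full truesize.
	 */
	void sock_efree(struct sk_buff *skb)
	{
		sock_put(skb->sk);
	}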
* Re: [PATCH net] netem: fix skb_orphan_partial()
  2017-05-11 22:24 ` [PATCH net] netem: fix skb_orphan_partial() Eric Dumazet
@ 2017-05-12  1:33   ` David Miller
  0 siblings, 0 replies; 6+ messages in thread
From: David Miller @ 2017-05-12  1:33 UTC (permalink / raw)
  To: eric.dumazet; +Cc: mkm, stephen, edumazet, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 11 May 2017 15:24:41 -0700

> From: Eric Dumazet <edumazet@google.com>
>
> [patch changelog, quoted in full above, trimmed]

Applied and queued up for -stable, thanks.

^ permalink raw reply	[flat|nested] 6+ messages in thread
end of thread, other threads:[~2017-05-12  1:33 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-11 16:47 Fw: [Bug 195713] New: TCP recv queue grows huge Stephen Hemminger
2017-05-11 17:06 ` Eric Dumazet
2017-05-11 19:29   ` Michael Madsen
2017-05-11 19:42     ` Eric Dumazet
2017-05-11 22:24       ` [PATCH net] netem: fix skb_orphan_partial() Eric Dumazet
2017-05-12  1:33         ` David Miller