* v3 for tcp friends? @ 2012-09-03 19:48 David Miller 2012-09-04 15:10 ` Bruce Curtis 0 siblings, 1 reply; 14+ messages in thread From: David Miller @ 2012-09-03 19:48 UTC (permalink / raw) To: netdev; +Cc: brutus Bruce can we get a new spin of the patch? I'd really like to get this in soon so it gets a lot of testing before the next merge window opens up. Thanks. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: v3 for tcp friends? 2012-09-03 19:48 v3 for tcp friends? David Miller @ 2012-09-04 15:10 ` Bruce Curtis 2012-09-04 16:58 ` David Miller 0 siblings, 1 reply; 14+ messages in thread From: Bruce Curtis @ 2012-09-04 15:10 UTC (permalink / raw) To: David Miller; +Cc: netdev Will do, issues addressed, I'll get the patch out later today or tomorrow at the latest. On Mon, Sep 3, 2012 at 12:48 PM, David Miller <davem@davemloft.net> wrote: > > Bruce can we get a new spin of the patch? I'd really like to get this in > soon so it gets a lot of testing before the next merge window opens up. > > Thanks. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: v3 for tcp friends? 2012-09-04 15:10 ` Bruce Curtis @ 2012-09-04 16:58 ` David Miller 2013-01-21 7:29 ` Li Yu 0 siblings, 1 reply; 14+ messages in thread From: David Miller @ 2012-09-04 16:58 UTC (permalink / raw) To: brutus; +Cc: netdev From: Bruce Curtis <brutus@google.com> Date: Tue, 4 Sep 2012 08:10:23 -0700 > Will do, issues addressed, I'll get the patch out later today or > tomorrow at the latest. Thanks a lot Bruce. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: v3 for tcp friends? 2012-09-04 16:58 ` David Miller @ 2013-01-21 7:29 ` Li Yu 2013-01-21 7:31 ` Li Yu 0 siblings, 1 reply; 14+ messages in thread From: Li Yu @ 2013-01-21 7:29 UTC (permalink / raw) To: David Miller; +Cc: brutus, netdev 于 2012年09月05日 00:58, David Miller 写道: > From: Bruce Curtis <brutus@google.com> > Date: Tue, 4 Sep 2012 08:10:23 -0700 > >> Will do, issues addressed, I'll get the patch out later today or >> tomorrow at the latest. > > Thanks a lot Bruce. > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Hi, Bruce, I tested the TCP friends, found a bug here: [ 106.541372] Pid: 1765, comm: client Not tainted 3.7.0-rc1+ #25 [ 106.543121] Call Trace: [ 106.543950] [<ffffffff8133d212>] inet_sock_destruct+0x102/0x1f0 [ 106.545687] [<ffffffff812c38ad>] __sk_free+0x1d/0x110 [ 106.547209] [<ffffffff812c3a1c>] sk_free+0x1c/0x20 [ 106.548611] [<ffffffff8131680c>] tcp_close+0x6c/0x3f0 [ 106.549863] [<ffffffff8133caea>] inet_release+0xda/0xf0 [ 106.551134] [<ffffffff8133ca30>] ? inet_release+0x20/0xf0 [ 106.552419] [<ffffffff8137f3de>] ? mutex_unlock+0xe/0x10 [ 106.553658] [<ffffffff812bf948>] sock_release+0x28/0xa0 [ 106.557366] [<ffffffff812bfd69>] sock_close+0x29/0x30 [ 106.558831] [<ffffffff81128972>] __fput+0x122/0x210 [ 106.560541] [<ffffffff81128a6e>] ____fput+0xe/0x10 [ 106.562006] [<ffffffff8105354e>] task_work_run+0x9e/0xd0 [ 106.563285] [<ffffffff810027e1>] do_notify_resume+0x61/0x70 [ 106.564582] [<ffffffff8138a908>] int_signal+0x12/0x17 I also backported and tested it on stable kernel 3.7.3/RHEL6 kernel, this bug still exists. It seem that client may close listening sockets, may we need to add one reference count for listen socket before send its address to peer? And, our TCP friends v4? :) Thanks Yu ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: v3 for tcp friends? 2013-01-21 7:29 ` Li Yu @ 2013-01-21 7:31 ` Li Yu [not found] ` <CA+WLrf8xuJ3UXK-tZQykuTPMXu67WsNKWNuNzBKk7MBAidW-CQ@mail.gmail.com> 0 siblings, 1 reply; 14+ messages in thread From: Li Yu @ 2013-01-21 7:31 UTC (permalink / raw) To: David Miller; +Cc: brutus, netdev 于 2013年01月21日 15:29, Li Yu 写道: > 于 2012年09月05日 00:58, David Miller 写道: >> From: Bruce Curtis <brutus@google.com> >> Date: Tue, 4 Sep 2012 08:10:23 -0700 >> >>> Will do, issues addressed, I'll get the patch out later today or >>> tomorrow at the latest. >> >> Thanks a lot Bruce. >> -- >> To unsubscribe from this list: send the line "unsubscribe netdev" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > > > Hi, Bruce, > > I tested the TCP friends, found a bug here: > > [ 106.541372] Pid: 1765, comm: client Not tainted 3.7.0-rc1+ #25 > [ 106.543121] Call Trace: > [ 106.543950] [<ffffffff8133d212>] inet_sock_destruct+0x102/0x1f0 > [ 106.545687] [<ffffffff812c38ad>] __sk_free+0x1d/0x110 > [ 106.547209] [<ffffffff812c3a1c>] sk_free+0x1c/0x20 > [ 106.548611] [<ffffffff8131680c>] tcp_close+0x6c/0x3f0 > [ 106.549863] [<ffffffff8133caea>] inet_release+0xda/0xf0 > [ 106.551134] [<ffffffff8133ca30>] ? inet_release+0x20/0xf0 > [ 106.552419] [<ffffffff8137f3de>] ? mutex_unlock+0xe/0x10 > [ 106.553658] [<ffffffff812bf948>] sock_release+0x28/0xa0 > [ 106.557366] [<ffffffff812bfd69>] sock_close+0x29/0x30 > [ 106.558831] [<ffffffff81128972>] __fput+0x122/0x210 > [ 106.560541] [<ffffffff81128a6e>] ____fput+0xe/0x10 > [ 106.562006] [<ffffffff8105354e>] task_work_run+0x9e/0xd0 > [ 106.563285] [<ffffffff810027e1>] do_notify_resume+0x61/0x70 > [ 106.564582] [<ffffffff8138a908>] int_signal+0x12/0x17 > > > I also backported and tested it on stable kernel 3.7.3/RHEL6 > kernel, this bug still exists. It seem that client may close listening > sockets, may we need to add one reference count for listen socket > before send its address to peer? > Sorry, I lost an important line of kernel log before above them: [ 106.539367] IPv4: Attempt to release TCP socket in state 10 ffff880074abb5c0 BTW: state 10 = TCP_LISTEN > And, our TCP friends v4? :) > > Thanks > > Yu > ^ permalink raw reply [flat|nested] 14+ messages in thread
[parent not found: <CA+WLrf8xuJ3UXK-tZQykuTPMXu67WsNKWNuNzBKk7MBAidW-CQ@mail.gmail.com>]
[parent not found: <CAEkNxbH0MAAU9oiwwrFTbMJP1yzfNdxD5NDdZOqGnvhTLvodoQ@mail.gmail.com>]
* Re: v3 for tcp friends? [not found] ` <CAEkNxbH0MAAU9oiwwrFTbMJP1yzfNdxD5NDdZOqGnvhTLvodoQ@mail.gmail.com> @ 2013-01-23 3:14 ` Li Yu 2013-01-23 6:12 ` Li Yu 0 siblings, 1 reply; 14+ messages in thread From: Li Yu @ 2013-01-23 3:14 UTC (permalink / raw) To: Bruce Curtis; +Cc: David Miller, netdev 于 2013年01月23日 05:08, Bruce Curtis 写道: > Thanks, Li > > Started working on friends again, v4, more soon. > > :) I found another odd bug in TCP friends v3, the clients may hang at tcp_sendmsg() -> sk_stream_wait_memory() with or without my refcnt fix patch. Below shell script can reproduce this bug: #! /bin/sh -x sysctl -w net.ipv4.tcp_rmem="4096 1073741824 1073741824" sysctl -w net.ipv4.tcp_wmem="4096 1073741824 1073741824" sysctl -w net.ipv4.tcp_friends=1 msg=64K buf=256M pkill -9 netserver netserver netperf -t TCP_STREAM -l 1 -- -m ${msg} -M ${msg} -s ${buf} -S ${buf} -4 sysctl -w net.ipv4.tcp_friends=0 pkill -9 netserver netserver netperf -t TCP_STREAM -l 1 -- -m ${msg} -M ${msg} -s ${buf} -S ${buf} -4 ##################SCRIPT END################### netperf kernel stack is (by cat /proc/$netperf_pid/stack) [<ffffffff812ce939>] sk_stream_wait_memory+0x2d9/0x2f0 [<ffffffff8131460c>] tcp_sendmsg+0xf6c/0x1240 [<ffffffff8133c117>] inet_sendmsg+0xf7/0x110 [<ffffffff812bedfd>] sock_sendmsg+0x7d/0xa0 [<ffffffff812c0e4d>] sys_sendto+0x13d/0x190 [<ffffffff8138a6c2>] system_call_fastpath+0x16/0x1b [<ffffffffffffffff>] 0xffffffffffffffff netserver kernel stack is : [<ffffffff812c46ae>] sk_wait_data+0x8e/0xe0 [<ffffffff81315993>] tcp_recvmsg+0x5c3/0xbe0 [<ffffffff8133aefd>] inet_recvmsg+0xed/0x110 [<ffffffff812becf4>] sock_recvmsg+0x84/0xb0 [<ffffffff812c0fae>] sys_recvfrom+0xee/0x170 [<ffffffff8138a6c2>] system_call_fastpath+0x16/0x1b [<ffffffffffffffff>] 0xffffffffffffffff And, "netstat -tnp" give us below results: Active Internet connections (w/o servers) Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name tcp 0 0 127.0.0.1:35451 127.0.0.1:12865 ESTABLISHED 2316/netperf tcp6 0 0 127.0.0.1:12865 127.0.0.1:35451 ESTABLISHED 2317/netserver (It seems that netperf hangs on the control connection of benchmark) I also try to fix this ... Thanks Yu > On Mon, Jan 21, 2013 at 12:55 AM, Li Yu <raise.sail@gmail.com > <mailto:raise.sail@gmail.com>> wrote: > > 2013/1/21 Li Yu <raise.sail@gmail.com <mailto:raise.sail@gmail.com>> > > 于 2013年01月21日 15:29, Li Yu 写道: > > 于 2012年09月05日 00:58, David Miller 写道: > > From: Bruce Curtis <brutus@google.com > <mailto:brutus@google.com>> > Date: Tue, 4 Sep 2012 08:10:23 -0700 > > Will do, issues addressed, I'll get the patch out > later today or > tomorrow at the latest. > > > Thanks a lot Bruce. > -- > To unsubscribe from this list: send the line > "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > <mailto:majordomo@vger.kernel.org> > More majordomo info at > http://vger.kernel.org/__majordomo-info.html > <http://vger.kernel.org/majordomo-info.html> > > > > Hi, Bruce, > > I tested the TCP friends, found a bug here: > > [ 106.541372] Pid: 1765, comm: client Not tainted > 3.7.0-rc1+ #25 > [ 106.543121] Call Trace: > [ 106.543950] [<ffffffff8133d212>] > inet_sock_destruct+0x102/0x1f0 > [ 106.545687] [<ffffffff812c38ad>] __sk_free+0x1d/0x110 > [ 106.547209] [<ffffffff812c3a1c>] sk_free+0x1c/0x20 > [ 106.548611] [<ffffffff8131680c>] tcp_close+0x6c/0x3f0 > [ 106.549863] [<ffffffff8133caea>] inet_release+0xda/0xf0 > [ 106.551134] [<ffffffff8133ca30>] ? inet_release+0x20/0xf0 > [ 106.552419] [<ffffffff8137f3de>] ? mutex_unlock+0xe/0x10 > [ 106.553658] [<ffffffff812bf948>] sock_release+0x28/0xa0 > [ 106.557366] [<ffffffff812bfd69>] sock_close+0x29/0x30 > [ 106.558831] [<ffffffff81128972>] __fput+0x122/0x210 > [ 106.560541] [<ffffffff81128a6e>] ____fput+0xe/0x10 > [ 106.562006] [<ffffffff8105354e>] task_work_run+0x9e/0xd0 > [ 106.563285] [<ffffffff810027e1>] do_notify_resume+0x61/0x70 > [ 106.564582] [<ffffffff8138a908>] int_signal+0x12/0x17 > > > I also backported and tested it on stable kernel > 3.7.3/RHEL6 > kernel, this bug still exists. It seem that client may close > listening > sockets, may we need to add one reference count for listen > socket > before send its address to peer? > > > Sorry, I lost an important line of kernel log before above them: > > [ 106.539367] IPv4: Attempt to release TCP socket in state 10 > ffff880074abb5c0 > > BTW: state 10 = TCP_LISTEN > > > It seem this patch works for me. > > diff --git a/net/ipv4/inet_connection_sock.c > b/net/ipv4/inet_connection_sock.c > index 9641215..a625c02 100644 > --- a/net/ipv4/inet_connection_sock.c > +++ b/net/ipv4/inet_connection_sock.c > @@ -623,8 +623,11 @@ struct sock *inet_csk_clone(struct sock *sk, > const struct request_sock *req, > sock_hold(newsk); > was = xchg(&req->friend->sk_friend, newsk); > /* If requester already connect()ed, maybe > sleeping */ > - if (was && !sock_flag(req->friend, SOCK_DEAD)) > - sk->sk_state_change(req->friend); > + if (was) { > + if (!sock_flag(req->friend, SOCK_DEAD)) > + > sk->sk_state_change(req->friend); > + sock_put(was); > + } > } > newsk->sk_state = TCP_SYN_RECV; > diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c > index 5917485..7a63245 100644 > --- a/net/ipv4/tcp_output.c > +++ b/net/ipv4/tcp_output.c > @@ -2277,8 +2277,10 @@ struct sk_buff *tcp_make_synack(struct sock > *sk, struct dst_entry *dst, > memset(&opts, 0, sizeof(opts)); > /* Only try to make friends if enabled */ > - if (sysctl_tcp_friends) > + if (sysctl_tcp_friends) { > + sock_hold(sk); > skb->friend = sk; > + } > #ifdef CONFIG_SYN_COOKIES > if (unlikely(req->cookie_ts)) > > > And, our TCP friends v4? :) > > Thanks > > Yu > > > ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: v3 for tcp friends? 2013-01-23 3:14 ` Li Yu @ 2013-01-23 6:12 ` Li Yu 2013-01-23 6:46 ` Eric Dumazet 0 siblings, 1 reply; 14+ messages in thread From: Li Yu @ 2013-01-23 6:12 UTC (permalink / raw) To: Bruce Curtis; +Cc: David Miller, netdev Oops, this hang is not since TCP friends patch! sk_sndbuf_get() is broken by 32 bits integer overflow because of so large value in net.ipv4.tcp_{rmem,wmem}. but this hang also can be found in net-next.git (3.8.0-rc3+), if we run below commands, then all new TCP connections stop working! # when TCP friends is disabled sysctl -w net.ipv4.tcp_rmem="4096 4294967296 4294967296" # 4GB sysctl -w net.ipv4.tcp_wmem="4096 4294967296 4294967296" Thanks Yu 于 2013年01月23日 11:14, Li Yu 写道: > 于 2013年01月23日 05:08, Bruce Curtis 写道: >> Thanks, Li >> >> Started working on friends again, v4, more soon. >> >> > > :) > > I found another odd bug in TCP friends v3, the clients > may hang at tcp_sendmsg() -> sk_stream_wait_memory() with or > without my refcnt fix patch. > > Below shell script can reproduce this bug: > > #! /bin/sh -x > > sysctl -w net.ipv4.tcp_rmem="4096 1073741824 1073741824" > sysctl -w net.ipv4.tcp_wmem="4096 1073741824 1073741824" > > sysctl -w net.ipv4.tcp_friends=1 > > msg=64K > buf=256M > > pkill -9 netserver > netserver > netperf -t TCP_STREAM -l 1 -- -m ${msg} -M ${msg} -s ${buf} -S ${buf} -4 > > sysctl -w net.ipv4.tcp_friends=0 > pkill -9 netserver > netserver > netperf -t TCP_STREAM -l 1 -- -m ${msg} -M ${msg} -s ${buf} -S ${buf} -4 > ##################SCRIPT END################### > > netperf kernel stack is (by cat /proc/$netperf_pid/stack) > > [<ffffffff812ce939>] sk_stream_wait_memory+0x2d9/0x2f0 > [<ffffffff8131460c>] tcp_sendmsg+0xf6c/0x1240 > [<ffffffff8133c117>] inet_sendmsg+0xf7/0x110 > [<ffffffff812bedfd>] sock_sendmsg+0x7d/0xa0 > [<ffffffff812c0e4d>] sys_sendto+0x13d/0x190 > [<ffffffff8138a6c2>] system_call_fastpath+0x16/0x1b > [<ffffffffffffffff>] 0xffffffffffffffff > > netserver kernel stack is : > > [<ffffffff812c46ae>] sk_wait_data+0x8e/0xe0 > [<ffffffff81315993>] tcp_recvmsg+0x5c3/0xbe0 > [<ffffffff8133aefd>] inet_recvmsg+0xed/0x110 > [<ffffffff812becf4>] sock_recvmsg+0x84/0xb0 > [<ffffffff812c0fae>] sys_recvfrom+0xee/0x170 > [<ffffffff8138a6c2>] system_call_fastpath+0x16/0x1b > [<ffffffffffffffff>] 0xffffffffffffffff > > And, "netstat -tnp" give us below results: > > Active Internet connections (w/o servers) > Proto Recv-Q Send-Q Local Address Foreign Address > State PID/Program name > tcp 0 0 127.0.0.1:35451 127.0.0.1:12865 > ESTABLISHED 2316/netperf > tcp6 0 0 127.0.0.1:12865 127.0.0.1:35451 > ESTABLISHED 2317/netserver > > (It seems that netperf hangs on the control connection of benchmark) > > I also try to fix this ... > > Thanks > > Yu > >> On Mon, Jan 21, 2013 at 12:55 AM, Li Yu <raise.sail@gmail.com >> <mailto:raise.sail@gmail.com>> wrote: >> >> 2013/1/21 Li Yu <raise.sail@gmail.com <mailto:raise.sail@gmail.com>> >> >> 于 2013年01月21日 15:29, Li Yu 写道: >> >> 于 2012年09月05日 00:58, David Miller 写道: >> >> From: Bruce Curtis <brutus@google.com >> <mailto:brutus@google.com>> >> Date: Tue, 4 Sep 2012 08:10:23 -0700 >> >> Will do, issues addressed, I'll get the patch out >> later today or >> tomorrow at the latest. >> >> >> Thanks a lot Bruce. >> -- >> To unsubscribe from this list: send the line >> "unsubscribe netdev" in >> the body of a message to majordomo@vger.kernel.org >> <mailto:majordomo@vger.kernel.org> >> More majordomo info at >> http://vger.kernel.org/__majordomo-info.html >> <http://vger.kernel.org/majordomo-info.html> >> >> >> >> Hi, Bruce, >> >> I tested the TCP friends, found a bug here: >> >> [ 106.541372] Pid: 1765, comm: client Not tainted >> 3.7.0-rc1+ #25 >> [ 106.543121] Call Trace: >> [ 106.543950] [<ffffffff8133d212>] >> inet_sock_destruct+0x102/0x1f0 >> [ 106.545687] [<ffffffff812c38ad>] __sk_free+0x1d/0x110 >> [ 106.547209] [<ffffffff812c3a1c>] sk_free+0x1c/0x20 >> [ 106.548611] [<ffffffff8131680c>] tcp_close+0x6c/0x3f0 >> [ 106.549863] [<ffffffff8133caea>] inet_release+0xda/0xf0 >> [ 106.551134] [<ffffffff8133ca30>] ? inet_release+0x20/0xf0 >> [ 106.552419] [<ffffffff8137f3de>] ? mutex_unlock+0xe/0x10 >> [ 106.553658] [<ffffffff812bf948>] sock_release+0x28/0xa0 >> [ 106.557366] [<ffffffff812bfd69>] sock_close+0x29/0x30 >> [ 106.558831] [<ffffffff81128972>] __fput+0x122/0x210 >> [ 106.560541] [<ffffffff81128a6e>] ____fput+0xe/0x10 >> [ 106.562006] [<ffffffff8105354e>] task_work_run+0x9e/0xd0 >> [ 106.563285] [<ffffffff810027e1>] do_notify_resume+0x61/0x70 >> [ 106.564582] [<ffffffff8138a908>] int_signal+0x12/0x17 >> >> >> I also backported and tested it on stable kernel >> 3.7.3/RHEL6 >> kernel, this bug still exists. It seem that client may close >> listening >> sockets, may we need to add one reference count for listen >> socket >> before send its address to peer? >> >> >> Sorry, I lost an important line of kernel log before above them: >> >> [ 106.539367] IPv4: Attempt to release TCP socket in state 10 >> ffff880074abb5c0 >> >> BTW: state 10 = TCP_LISTEN >> >> >> It seem this patch works for me. >> >> diff --git a/net/ipv4/inet_connection_sock.c >> b/net/ipv4/inet_connection_sock.c >> index 9641215..a625c02 100644 >> --- a/net/ipv4/inet_connection_sock.c >> +++ b/net/ipv4/inet_connection_sock.c >> @@ -623,8 +623,11 @@ struct sock *inet_csk_clone(struct sock *sk, >> const struct request_sock *req, >> sock_hold(newsk); >> was = xchg(&req->friend->sk_friend, newsk); >> /* If requester already connect()ed, maybe >> sleeping */ >> - if (was && !sock_flag(req->friend, SOCK_DEAD)) >> - sk->sk_state_change(req->friend); >> + if (was) { >> + if (!sock_flag(req->friend, SOCK_DEAD)) >> + >> sk->sk_state_change(req->friend); >> + sock_put(was); >> + } >> } >> newsk->sk_state = TCP_SYN_RECV; >> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c >> index 5917485..7a63245 100644 >> --- a/net/ipv4/tcp_output.c >> +++ b/net/ipv4/tcp_output.c >> @@ -2277,8 +2277,10 @@ struct sk_buff *tcp_make_synack(struct sock >> *sk, struct dst_entry *dst, >> memset(&opts, 0, sizeof(opts)); >> /* Only try to make friends if enabled */ >> - if (sysctl_tcp_friends) >> + if (sysctl_tcp_friends) { >> + sock_hold(sk); >> skb->friend = sk; >> + } >> #ifdef CONFIG_SYN_COOKIES >> if (unlikely(req->cookie_ts)) >> >> >> And, our TCP friends v4? :) >> >> Thanks >> >> Yu >> >> >> > ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: v3 for tcp friends? 2013-01-23 6:12 ` Li Yu @ 2013-01-23 6:46 ` Eric Dumazet 2013-01-23 7:21 ` Li Yu 0 siblings, 1 reply; 14+ messages in thread From: Eric Dumazet @ 2013-01-23 6:46 UTC (permalink / raw) To: Li Yu; +Cc: Bruce Curtis, David Miller, netdev On Wed, 2013-01-23 at 14:12 +0800, Li Yu wrote: > Oops, this hang is not since TCP friends patch! > > sk_sndbuf_get() is broken by 32 bits integer overflow > because of so large value in net.ipv4.tcp_{rmem,wmem}. > > but this hang also can be found in net-next.git > (3.8.0-rc3+), if we run below commands, then all new > TCP connections stop working! > > # when TCP friends is disabled > sysctl -w net.ipv4.tcp_rmem="4096 4294967296 4294967296" # 4GB > sysctl -w net.ipv4.tcp_wmem="4096 4294967296 4294967296" Right we need to make sure we dont overflow. Try the following fix : diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c index a25e1d2..1459145 100644 --- a/net/ipv4/sysctl_net_ipv4.c +++ b/net/ipv4/sysctl_net_ipv4.c @@ -549,14 +549,16 @@ static struct ctl_table ipv4_table[] = { .data = &sysctl_tcp_wmem, .maxlen = sizeof(sysctl_tcp_wmem), .mode = 0644, - .proc_handler = proc_dointvec + .extra1 = &zero, + .proc_handler = proc_dointvec_minmax }, { .procname = "tcp_rmem", .data = &sysctl_tcp_rmem, .maxlen = sizeof(sysctl_tcp_rmem), .mode = 0644, - .proc_handler = proc_dointvec + .extra1 = &zero, + .proc_handler = proc_dointvec_minmax }, { .procname = "tcp_app_win", ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: v3 for tcp friends? 2013-01-23 6:46 ` Eric Dumazet @ 2013-01-23 7:21 ` Li Yu 2013-01-23 7:58 ` Li Yu 0 siblings, 1 reply; 14+ messages in thread From: Li Yu @ 2013-01-23 7:21 UTC (permalink / raw) To: netdev 于 2013年01月23日 14:46, Eric Dumazet 写道: > On Wed, 2013-01-23 at 14:12 +0800, Li Yu wrote: >> Oops, this hang is not since TCP friends patch! >> >> sk_sndbuf_get() is broken by 32 bits integer overflow >> because of so large value in net.ipv4.tcp_{rmem,wmem}. >> >> but this hang also can be found in net-next.git >> (3.8.0-rc3+), if we run below commands, then all new >> TCP connections stop working! >> >> # when TCP friends is disabled >> sysctl -w net.ipv4.tcp_rmem="4096 4294967296 4294967296" # 4GB >> sysctl -w net.ipv4.tcp_wmem="4096 4294967296 4294967296" > > Right we need to make sure we dont overflow. > > Try the following fix : > > diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c > index a25e1d2..1459145 100644 > --- a/net/ipv4/sysctl_net_ipv4.c > +++ b/net/ipv4/sysctl_net_ipv4.c > @@ -549,14 +549,16 @@ static struct ctl_table ipv4_table[] = { > .data = &sysctl_tcp_wmem, > .maxlen = sizeof(sysctl_tcp_wmem), > .mode = 0644, > - .proc_handler = proc_dointvec > + .extra1 = &zero, > + .proc_handler = proc_dointvec_minmax > }, > { > .procname = "tcp_rmem", > .data = &sysctl_tcp_rmem, > .maxlen = sizeof(sysctl_tcp_rmem), > .mode = 0644, > - .proc_handler = proc_dointvec > + .extra1 = &zero, > + .proc_handler = proc_dointvec_minmax > }, > { > .procname = "tcp_app_win", > > > Thanks for so quick reply, I will test it soon. however I suspect whether this patch could fix overflow if we merged TCP friends patch in future. With TCP friends, we have source like below, or TCP friends should care this ? static inline int sk_sndbuf_get(const struct sock *sk) { if (sk->sk_friend) return sk->sk_sndbuf + sk->sk_friend->sk_rcvbuf; else return sk->sk_sndbuf; } static inline bool sk_stream_memory_free(const struct sock *sk) { return sk_wmem_queued_get(sk) < sk_sndbuf_get(sk); } So sk_sndbuf_get() still may be overflow when we have right value in net.ipv4.tcp_{rmem,wmem}. Thanks ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: v3 for tcp friends? 2013-01-23 7:21 ` Li Yu @ 2013-01-23 7:58 ` Li Yu 2013-01-23 9:39 ` Li Yu 0 siblings, 1 reply; 14+ messages in thread From: Li Yu @ 2013-01-23 7:58 UTC (permalink / raw) To: netdev; +Cc: David Miller, Eric Dumazet, Bruce Curtis 于 2013年01月23日 15:21, Li Yu 写道: > 于 2013年01月23日 14:46, Eric Dumazet 写道: >> On Wed, 2013-01-23 at 14:12 +0800, Li Yu wrote: >>> Oops, this hang is not since TCP friends patch! >>> >>> sk_sndbuf_get() is broken by 32 bits integer overflow >>> because of so large value in net.ipv4.tcp_{rmem,wmem}. >>> >>> but this hang also can be found in net-next.git >>> (3.8.0-rc3+), if we run below commands, then all new >>> TCP connections stop working! >>> >>> # when TCP friends is disabled >>> sysctl -w net.ipv4.tcp_rmem="4096 4294967296 4294967296" # 4GB >>> sysctl -w net.ipv4.tcp_wmem="4096 4294967296 4294967296" >> >> Right we need to make sure we dont overflow. >> >> Try the following fix : >> >> diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c >> index a25e1d2..1459145 100644 >> --- a/net/ipv4/sysctl_net_ipv4.c >> +++ b/net/ipv4/sysctl_net_ipv4.c >> @@ -549,14 +549,16 @@ static struct ctl_table ipv4_table[] = { >> .data = &sysctl_tcp_wmem, >> .maxlen = sizeof(sysctl_tcp_wmem), >> .mode = 0644, >> - .proc_handler = proc_dointvec >> + .extra1 = &zero, >> + .proc_handler = proc_dointvec_minmax >> }, >> { >> .procname = "tcp_rmem", >> .data = &sysctl_tcp_rmem, >> .maxlen = sizeof(sysctl_tcp_rmem), >> .mode = 0644, >> - .proc_handler = proc_dointvec >> + .extra1 = &zero, >> + .proc_handler = proc_dointvec_minmax >> }, >> { >> .procname = "tcp_app_win", >> >> >> > Thanks for so quick reply, I will test it soon. > > however I suspect whether this patch could fix overflow if we merged TCP > friends patch in future. > Tested on 3.7.0-rc1+, but bug is still alive with disabled TCP friends ... Thanks Yu > With TCP friends, we have source like below, or TCP friends should care > this ? > > static inline int sk_sndbuf_get(const struct sock *sk) > { > if (sk->sk_friend) > return sk->sk_sndbuf + sk->sk_friend->sk_rcvbuf; > else > return sk->sk_sndbuf; > } > > static inline bool sk_stream_memory_free(const struct sock *sk) > { > return sk_wmem_queued_get(sk) < sk_sndbuf_get(sk); > } > > So sk_sndbuf_get() still may be overflow when we have right value in > net.ipv4.tcp_{rmem,wmem}. > > Thanks ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: v3 for tcp friends? 2013-01-23 7:58 ` Li Yu @ 2013-01-23 9:39 ` Li Yu 2013-01-23 9:52 ` Li Yu 2013-01-24 2:04 ` Xiaotian Feng 0 siblings, 2 replies; 14+ messages in thread From: Li Yu @ 2013-01-23 9:39 UTC (permalink / raw) To: netdev; +Cc: David Miller, Eric Dumazet, Bruce Curtis 于 2013年01月23日 15:58, Li Yu 写道: > 于 2013年01月23日 15:21, Li Yu 写道: >> 于 2013年01月23日 14:46, Eric Dumazet 写道: >>> On Wed, 2013-01-23 at 14:12 +0800, Li Yu wrote: >>>> Oops, this hang is not since TCP friends patch! >>>> >>>> sk_sndbuf_get() is broken by 32 bits integer overflow >>>> because of so large value in net.ipv4.tcp_{rmem,wmem}. >>>> >>>> but this hang also can be found in net-next.git >>>> (3.8.0-rc3+), if we run below commands, then all new >>>> TCP connections stop working! >>>> >>>> # when TCP friends is disabled >>>> sysctl -w net.ipv4.tcp_rmem="4096 4294967296 4294967296" # 4GB >>>> sysctl -w net.ipv4.tcp_wmem="4096 4294967296 4294967296" >>> >>> Right we need to make sure we dont overflow. >>> >>> Try the following fix : >>> >>> diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c >>> index a25e1d2..1459145 100644 >>> --- a/net/ipv4/sysctl_net_ipv4.c >>> +++ b/net/ipv4/sysctl_net_ipv4.c >>> @@ -549,14 +549,16 @@ static struct ctl_table ipv4_table[] = { >>> .data = &sysctl_tcp_wmem, >>> .maxlen = sizeof(sysctl_tcp_wmem), >>> .mode = 0644, >>> - .proc_handler = proc_dointvec >>> + .extra1 = &zero, If we added below: +static int one = 1; +static int int_max = INT_MAX; .... + .extra1 = &one, + .extra2 = &int_max, The bug is fixed without TCP friends. Maybe, TCP friends need to use long integer to record result of "sk->sk_sndbuf + sk->sk_friend->sk_rcvbuf" ? BTW: This overflow problem also breaks UDP sockets. Thanks Yu >>> + .proc_handler = proc_dointvec_minmax >>> }, >>> { >>> .procname = "tcp_rmem", >>> .data = &sysctl_tcp_rmem, >>> .maxlen = sizeof(sysctl_tcp_rmem), >>> .mode = 0644, >>> - .proc_handler = proc_dointvec >>> + .extra1 = &zero, >>> + .proc_handler = proc_dointvec_minmax >>> }, >>> { >>> .procname = "tcp_app_win", >>> >>> >>> >> Thanks for so quick reply, I will test it soon. >> >> however I suspect whether this patch could fix overflow if we merged TCP >> friends patch in future. >> > > Tested on 3.7.0-rc1+, but bug is still alive > with disabled TCP friends ... > > Thanks > > Yu > >> With TCP friends, we have source like below, or TCP friends should care >> this ? >> >> static inline int sk_sndbuf_get(const struct sock *sk) >> { >> if (sk->sk_friend) >> return sk->sk_sndbuf + sk->sk_friend->sk_rcvbuf; >> else >> return sk->sk_sndbuf; >> } >> >> static inline bool sk_stream_memory_free(const struct sock *sk) >> { >> return sk_wmem_queued_get(sk) < sk_sndbuf_get(sk); >> } >> >> So sk_sndbuf_get() still may be overflow when we have right value in >> net.ipv4.tcp_{rmem,wmem}. >> >> Thanks > ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: v3 for tcp friends? 2013-01-23 9:39 ` Li Yu @ 2013-01-23 9:52 ` Li Yu 2013-01-23 13:16 ` Weiping Pan 2013-01-24 2:04 ` Xiaotian Feng 1 sibling, 1 reply; 14+ messages in thread From: Li Yu @ 2013-01-23 9:52 UTC (permalink / raw) To: netdev; +Cc: David Miller, Eric Dumazet, Bruce Curtis 于 2013年01月23日 17:39, Li Yu 写道: > 于 2013年01月23日 15:58, Li Yu 写道: >> 于 2013年01月23日 15:21, Li Yu 写道: >>> 于 2013年01月23日 14:46, Eric Dumazet 写道: >>>> On Wed, 2013-01-23 at 14:12 +0800, Li Yu wrote: >>>>> Oops, this hang is not since TCP friends patch! >>>>> >>>>> sk_sndbuf_get() is broken by 32 bits integer overflow >>>>> because of so large value in net.ipv4.tcp_{rmem,wmem}. >>>>> >>>>> but this hang also can be found in net-next.git >>>>> (3.8.0-rc3+), if we run below commands, then all new >>>>> TCP connections stop working! >>>>> >>>>> # when TCP friends is disabled >>>>> sysctl -w net.ipv4.tcp_rmem="4096 4294967296 4294967296" # 4GB >>>>> sysctl -w net.ipv4.tcp_wmem="4096 4294967296 4294967296" >>>> >>>> Right we need to make sure we dont overflow. >>>> >>>> Try the following fix : >>>> >>>> diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c >>>> index a25e1d2..1459145 100644 >>>> --- a/net/ipv4/sysctl_net_ipv4.c >>>> +++ b/net/ipv4/sysctl_net_ipv4.c >>>> @@ -549,14 +549,16 @@ static struct ctl_table ipv4_table[] = { >>>> .data = &sysctl_tcp_wmem, >>>> .maxlen = sizeof(sysctl_tcp_wmem), >>>> .mode = 0644, >>>> - .proc_handler = proc_dointvec >>>> + .extra1 = &zero, > > If we added below: > > +static int one = 1; > +static int int_max = INT_MAX; > .... > + .extra1 = &one, > + .extra2 = &int_max, > The "int_max" may be unnecessary here :) > The bug is fixed without TCP friends. > > Maybe, TCP friends need to use long integer to record result > of "sk->sk_sndbuf + sk->sk_friend->sk_rcvbuf" ? > > BTW: This overflow problem also breaks UDP sockets. Unix domain sockets is broken too, and any other ? thanks Yu > > Thanks > > Yu > >>>> + .proc_handler = proc_dointvec_minmax >>>> }, >>>> { >>>> .procname = "tcp_rmem", >>>> .data = &sysctl_tcp_rmem, >>>> .maxlen = sizeof(sysctl_tcp_rmem), >>>> .mode = 0644, >>>> - .proc_handler = proc_dointvec >>>> + .extra1 = &zero, >>>> + .proc_handler = proc_dointvec_minmax >>>> }, >>>> { >>>> .procname = "tcp_app_win", >>>> >>>> >>>> >>> Thanks for so quick reply, I will test it soon. >>> >>> however I suspect whether this patch could fix overflow if we merged TCP >>> friends patch in future. >>> >> >> Tested on 3.7.0-rc1+, but bug is still alive >> with disabled TCP friends ... >> >> Thanks >> >> Yu >> >>> With TCP friends, we have source like below, or TCP friends should care >>> this ? >>> >>> static inline int sk_sndbuf_get(const struct sock *sk) >>> { >>> if (sk->sk_friend) >>> return sk->sk_sndbuf + sk->sk_friend->sk_rcvbuf; >>> else >>> return sk->sk_sndbuf; >>> } >>> >>> static inline bool sk_stream_memory_free(const struct sock *sk) >>> { >>> return sk_wmem_queued_get(sk) < sk_sndbuf_get(sk); >>> } >>> >>> So sk_sndbuf_get() still may be overflow when we have right value in >>> net.ipv4.tcp_{rmem,wmem}. >>> >>> Thanks >> > ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: v3 for tcp friends? 2013-01-23 9:52 ` Li Yu @ 2013-01-23 13:16 ` Weiping Pan 0 siblings, 0 replies; 14+ messages in thread From: Weiping Pan @ 2013-01-23 13:16 UTC (permalink / raw) To: Li Yu; +Cc: netdev, David Miller, Eric Dumazet, Bruce Curtis On 01/23/2013 05:52 PM, Li Yu wrote: > 于 2013年01月23日 17:39, Li Yu 写道: >> 于 2013年01月23日 15:58, Li Yu 写道: >>> 于 2013年01月23日 15:21, Li Yu 写道: >>>> 于 2013年01月23日 14:46, Eric Dumazet 写道: >>>>> On Wed, 2013-01-23 at 14:12 +0800, Li Yu wrote: >>>>>> Oops, this hang is not since TCP friends patch! >>>>>> >>>>>> sk_sndbuf_get() is broken by 32 bits integer overflow >>>>>> because of so large value in net.ipv4.tcp_{rmem,wmem}. >>>>>> >>>>>> but this hang also can be found in net-next.git >>>>>> (3.8.0-rc3+), if we run below commands, then all new >>>>>> TCP connections stop working! >>>>>> >>>>>> # when TCP friends is disabled >>>>>> sysctl -w net.ipv4.tcp_rmem="4096 4294967296 4294967296" # 4GB >>>>>> sysctl -w net.ipv4.tcp_wmem="4096 4294967296 4294967296" >>>>> >>>>> Right we need to make sure we dont overflow. >>>>> >>>>> Try the following fix : >>>>> >>>>> diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c >>>>> index a25e1d2..1459145 100644 >>>>> --- a/net/ipv4/sysctl_net_ipv4.c >>>>> +++ b/net/ipv4/sysctl_net_ipv4.c >>>>> @@ -549,14 +549,16 @@ static struct ctl_table ipv4_table[] = { >>>>> .data = &sysctl_tcp_wmem, >>>>> .maxlen = sizeof(sysctl_tcp_wmem), >>>>> .mode = 0644, >>>>> - .proc_handler = proc_dointvec >>>>> + .extra1 = &zero, >> >> If we added below: >> >> +static int one = 1; >> +static int int_max = INT_MAX; >> .... >> + .extra1 = &one, >> + .extra2 = &int_max, >> > > The "int_max" may be unnecessary here :) Hi, Li Yu, I tested that your patch works fine. Can you post a complete patch ? thanks Weiping Pan ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: v3 for tcp friends? 2013-01-23 9:39 ` Li Yu 2013-01-23 9:52 ` Li Yu @ 2013-01-24 2:04 ` Xiaotian Feng 1 sibling, 0 replies; 14+ messages in thread From: Xiaotian Feng @ 2013-01-24 2:04 UTC (permalink / raw) To: Li Yu; +Cc: netdev, David Miller, Eric Dumazet, Bruce Curtis On Wed, Jan 23, 2013 at 5:39 PM, Li Yu <raise.sail@gmail.com> wrote: > 于 2013年01月23日 15:58, Li Yu 写道: > >> 于 2013年01月23日 15:21, Li Yu 写道: >>> >>> 于 2013年01月23日 14:46, Eric Dumazet 写道: >>>> >>>> On Wed, 2013-01-23 at 14:12 +0800, Li Yu wrote: >>>>> >>>>> Oops, this hang is not since TCP friends patch! >>>>> >>>>> sk_sndbuf_get() is broken by 32 bits integer overflow >>>>> because of so large value in net.ipv4.tcp_{rmem,wmem}. >>>>> >>>>> but this hang also can be found in net-next.git >>>>> (3.8.0-rc3+), if we run below commands, then all new >>>>> TCP connections stop working! >>>>> >>>>> # when TCP friends is disabled >>>>> sysctl -w net.ipv4.tcp_rmem="4096 4294967296 4294967296" # 4GB >>>>> sysctl -w net.ipv4.tcp_wmem="4096 4294967296 4294967296" >>>> >>>> >>>> Right we need to make sure we dont overflow. >>>> >>>> Try the following fix : >>>> >>>> diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c >>>> index a25e1d2..1459145 100644 >>>> --- a/net/ipv4/sysctl_net_ipv4.c >>>> +++ b/net/ipv4/sysctl_net_ipv4.c >>>> @@ -549,14 +549,16 @@ static struct ctl_table ipv4_table[] = { >>>> .data = &sysctl_tcp_wmem, >>>> .maxlen = sizeof(sysctl_tcp_wmem), >>>> .mode = 0644, >>>> - .proc_handler = proc_dointvec >>>> + .extra1 = &zero, > > > If we added below: > > +static int one = 1; one is defined in kernel/sysctl.c > +static int int_max = INT_MAX; > .... > + .extra1 = &one, > + .extra2 = &int_max, I believe INT_MAX does not have actual effect, because any value bigger than INT_MAX will be cast to negative value, which is prevented by extra1... > > The bug is fixed without TCP friends. > > Maybe, TCP friends need to use long integer to record result > of "sk->sk_sndbuf + sk->sk_friend->sk_rcvbuf" ? > > BTW: This overflow problem also breaks UDP sockets. > > Thanks > > Yu > > >>>> + .proc_handler = proc_dointvec_minmax >>>> }, >>>> { >>>> .procname = "tcp_rmem", >>>> .data = &sysctl_tcp_rmem, >>>> .maxlen = sizeof(sysctl_tcp_rmem), >>>> .mode = 0644, >>>> - .proc_handler = proc_dointvec >>>> + .extra1 = &zero, >>>> + .proc_handler = proc_dointvec_minmax >>>> }, >>>> { >>>> .procname = "tcp_app_win", >>>> >>>> >>>> >>> Thanks for so quick reply, I will test it soon. >>> >>> however I suspect whether this patch could fix overflow if we merged TCP >>> friends patch in future. >>> >> >> Tested on 3.7.0-rc1+, but bug is still alive >> with disabled TCP friends ... >> >> Thanks >> >> Yu >> >>> With TCP friends, we have source like below, or TCP friends should care >>> this ? >>> >>> static inline int sk_sndbuf_get(const struct sock *sk) >>> { >>> if (sk->sk_friend) >>> return sk->sk_sndbuf + sk->sk_friend->sk_rcvbuf; >>> else >>> return sk->sk_sndbuf; >>> } >>> >>> static inline bool sk_stream_memory_free(const struct sock *sk) >>> { >>> return sk_wmem_queued_get(sk) < sk_sndbuf_get(sk); >>> } >>> >>> So sk_sndbuf_get() still may be overflow when we have right value in >>> net.ipv4.tcp_{rmem,wmem}. >>> >>> Thanks >> >> > > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2013-01-24 2:11 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-09-03 19:48 v3 for tcp friends? David Miller 2012-09-04 15:10 ` Bruce Curtis 2012-09-04 16:58 ` David Miller 2013-01-21 7:29 ` Li Yu 2013-01-21 7:31 ` Li Yu [not found] ` <CA+WLrf8xuJ3UXK-tZQykuTPMXu67WsNKWNuNzBKk7MBAidW-CQ@mail.gmail.com> [not found] ` <CAEkNxbH0MAAU9oiwwrFTbMJP1yzfNdxD5NDdZOqGnvhTLvodoQ@mail.gmail.com> 2013-01-23 3:14 ` Li Yu 2013-01-23 6:12 ` Li Yu 2013-01-23 6:46 ` Eric Dumazet 2013-01-23 7:21 ` Li Yu 2013-01-23 7:58 ` Li Yu 2013-01-23 9:39 ` Li Yu 2013-01-23 9:52 ` Li Yu 2013-01-23 13:16 ` Weiping Pan 2013-01-24 2:04 ` Xiaotian Feng
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).