From: Li Yu <raise.sail@gmail.com>
To: Bruce Curtis <brutus@google.com>
Cc: David Miller <davem@davemloft.net>, netdev <netdev@vger.kernel.org>
Subject: Re: v3 for tcp friends?
Date: Wed, 23 Jan 2013 14:12:52 +0800
Message-ID: <50FF7F64.2060902@gmail.com>
In-Reply-To: <50FF5592.60008@gmail.com>


Oops, this hang is not caused by the TCP friends patch!

sk_sndbuf_get() is broken by a 32-bit integer overflow
caused by the very large values in net.ipv4.tcp_{rmem,wmem}.

The hang can also be reproduced on net-next.git
(3.8.0-rc3+): after running the commands below, all new
TCP connections stop working!

# when TCP friends is disabled
sysctl -w net.ipv4.tcp_rmem="4096 4294967296 4294967296" # 4GB
sysctl -w net.ipv4.tcp_wmem="4096 4294967296 4294967296"
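
To illustrate the overflow, here is a minimal user-space sketch, not kernel code. It assumes the configured limit ends up in a 32-bit integer somewhere on the path; the helper names truncate_to_u32 and fits_in_s32 are hypothetical and do not appear in the kernel.

```c
#include <stdint.h>

/* Hypothetical helper, not kernel code: keep only the low 32 bits of a
 * 64-bit byte count, as happens when the value is stored in a 32-bit field. */
static uint32_t truncate_to_u32(uint64_t bytes)
{
	return (uint32_t)bytes;
}

/* Hypothetical helper: report whether a byte count fits in a signed
 * 32-bit int without wrapping. */
static int fits_in_s32(uint64_t bytes)
{
	return bytes <= (uint64_t)INT32_MAX;
}
```

With these helpers, truncate_to_u32(4294967296ULL) yields 0 because 4294967296 is exactly 2^32, while the 1073741824 used in the earlier script still fits in a signed 32-bit int.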

Thanks

Yu

On 2013-01-23 11:14, Li Yu wrote:
> On 2013-01-23 05:08, Bruce Curtis wrote:
>> Thanks, Li
>>
>> Started working on friends again, v4, more soon.
>>
>>
> 
> :)
> 
> I found another odd bug in TCP friends v3: clients
> may hang in tcp_sendmsg() -> sk_stream_wait_memory(), with or
> without my refcnt fix patch.
> 
> The following shell script reproduces this bug:
> 
> #! /bin/sh -x
> 
> sysctl -w net.ipv4.tcp_rmem="4096 1073741824 1073741824"
> sysctl -w net.ipv4.tcp_wmem="4096 1073741824 1073741824"
> 
> sysctl -w net.ipv4.tcp_friends=1
> 
> msg=64K
> buf=256M
> 
> pkill -9 netserver
> netserver
> netperf -t TCP_STREAM -l 1 -- -m ${msg} -M ${msg} -s ${buf} -S ${buf} -4
> 
> sysctl -w net.ipv4.tcp_friends=0
> pkill -9 netserver
> netserver
> netperf -t TCP_STREAM -l 1 -- -m ${msg} -M ${msg} -s ${buf} -S ${buf} -4
> ##################SCRIPT END###################
> 
> The netperf kernel stack (from cat /proc/$netperf_pid/stack) is:
> 
> [<ffffffff812ce939>] sk_stream_wait_memory+0x2d9/0x2f0
> [<ffffffff8131460c>] tcp_sendmsg+0xf6c/0x1240
> [<ffffffff8133c117>] inet_sendmsg+0xf7/0x110
> [<ffffffff812bedfd>] sock_sendmsg+0x7d/0xa0
> [<ffffffff812c0e4d>] sys_sendto+0x13d/0x190
> [<ffffffff8138a6c2>] system_call_fastpath+0x16/0x1b
> [<ffffffffffffffff>] 0xffffffffffffffff
> 
> The netserver kernel stack is:
> 
> [<ffffffff812c46ae>] sk_wait_data+0x8e/0xe0
> [<ffffffff81315993>] tcp_recvmsg+0x5c3/0xbe0
> [<ffffffff8133aefd>] inet_recvmsg+0xed/0x110
> [<ffffffff812becf4>] sock_recvmsg+0x84/0xb0
> [<ffffffff812c0fae>] sys_recvfrom+0xee/0x170
> [<ffffffff8138a6c2>] system_call_fastpath+0x16/0x1b
> [<ffffffffffffffff>] 0xffffffffffffffff
> 
> And "netstat -tnp" gives the following results:
> 
> Active Internet connections (w/o servers)
> Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
> tcp        0      0 127.0.0.1:35451         127.0.0.1:12865         ESTABLISHED 2316/netperf
> tcp6       0      0 127.0.0.1:12865         127.0.0.1:35451         ESTABLISHED 2317/netserver
> 
> (It seems that netperf hangs on the control connection of the benchmark.)
> 
> I am also trying to fix this ...
> 
> Thanks
> 
> Yu
> 
>> On Mon, Jan 21, 2013 at 12:55 AM, Li Yu <raise.sail@gmail.com
>> <mailto:raise.sail@gmail.com>> wrote:
>>
>>      2013/1/21 Li Yu <raise.sail@gmail.com <mailto:raise.sail@gmail.com>>
>>
>>          On 2013-01-21 15:29, Li Yu wrote:
>>
>>              On 2012-09-05 00:58, David Miller wrote:
>>
>>                  From: Bruce Curtis <brutus@google.com
>>                  <mailto:brutus@google.com>>
>>                  Date: Tue, 4 Sep 2012 08:10:23 -0700
>>
>>                      Will do, issues addressed, I'll get the patch out
>>                      later today or
>>                      tomorrow at the latest.
>>
>>
>>                  Thanks a lot Bruce.
>>
>>
>>
>>              Hi, Bruce,
>>
>>                    I tested TCP friends and found a bug:
>>
>>              [  106.541372] Pid: 1765, comm: client Not tainted 3.7.0-rc1+ #25
>>              [  106.543121] Call Trace:
>>              [  106.543950]  [<ffffffff8133d212>] inet_sock_destruct+0x102/0x1f0
>>              [  106.545687]  [<ffffffff812c38ad>] __sk_free+0x1d/0x110
>>              [  106.547209]  [<ffffffff812c3a1c>] sk_free+0x1c/0x20
>>              [  106.548611]  [<ffffffff8131680c>] tcp_close+0x6c/0x3f0
>>              [  106.549863]  [<ffffffff8133caea>] inet_release+0xda/0xf0
>>              [  106.551134]  [<ffffffff8133ca30>] ? inet_release+0x20/0xf0
>>              [  106.552419]  [<ffffffff8137f3de>] ? mutex_unlock+0xe/0x10
>>              [  106.553658]  [<ffffffff812bf948>] sock_release+0x28/0xa0
>>              [  106.557366]  [<ffffffff812bfd69>] sock_close+0x29/0x30
>>              [  106.558831]  [<ffffffff81128972>] __fput+0x122/0x210
>>              [  106.560541]  [<ffffffff81128a6e>] ____fput+0xe/0x10
>>              [  106.562006]  [<ffffffff8105354e>] task_work_run+0x9e/0xd0
>>              [  106.563285]  [<ffffffff810027e1>] do_notify_resume+0x61/0x70
>>              [  106.564582]  [<ffffffff8138a908>] int_signal+0x12/0x17
>>
>>
>>                    I also backported it to the stable 3.7.3 kernel and the
>>              RHEL6 kernel and tested it there; the bug still exists. It seems
>>              that the client may close listening sockets. Maybe we need to
>>              take a reference on the listening socket before sending its
>>              address to the peer?
>>
>>
>>          Sorry, I omitted an important kernel log line that precedes the
>>          trace above:
>>
>>          [  106.539367] IPv4: Attempt to release TCP socket in state 10 ffff880074abb5c0
>>
>>          BTW: state 10 = TCP_LISTEN
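
For reference, the state number in that log line can be decoded with a small lookup table. This is a minimal user-space sketch, assuming the numbering from include/net/tcp_states.h in kernels of this era; the table and the helper tcp_state_name are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* TCP socket state numbers as used by Linux kernels of this era
 * (include/net/tcp_states.h). Index 0 is unused. */
static const char *const tcp_state_names[] = {
	[1]  = "TCP_ESTABLISHED",
	[2]  = "TCP_SYN_SENT",
	[3]  = "TCP_SYN_RECV",
	[4]  = "TCP_FIN_WAIT1",
	[5]  = "TCP_FIN_WAIT2",
	[6]  = "TCP_TIME_WAIT",
	[7]  = "TCP_CLOSE",
	[8]  = "TCP_CLOSE_WAIT",
	[9]  = "TCP_LAST_ACK",
	[10] = "TCP_LISTEN",
	[11] = "TCP_CLOSING",
};

/* Hypothetical helper: map a state number from a log line to its name,
 * returning "UNKNOWN" for values outside the table. */
static const char *tcp_state_name(int state)
{
	size_t n = sizeof(tcp_state_names) / sizeof(tcp_state_names[0]);

	if (state < 1 || (size_t)state >= n)
		return "UNKNOWN";
	return tcp_state_names[state];
}
```

Calling tcp_state_name(10) returns "TCP_LISTEN", matching the note above.
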
>>
>>
>>      It seems this patch works for me.
>>
>>      diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
>>      index 9641215..a625c02 100644
>>      --- a/net/ipv4/inet_connection_sock.c
>>      +++ b/net/ipv4/inet_connection_sock.c
>>      @@ -623,8 +623,11 @@ struct sock *inet_csk_clone(struct sock *sk, const struct request_sock *req,
>>                               sock_hold(newsk);
>>                               was = xchg(&req->friend->sk_friend, newsk);
>>                               /* If requester already connect()ed, maybe sleeping */
>>      -                       if (was && !sock_flag(req->friend, SOCK_DEAD))
>>      -                               sk->sk_state_change(req->friend);
>>      +                       if (was) {
>>      +                               if (!sock_flag(req->friend, SOCK_DEAD))
>>      +                                       sk->sk_state_change(req->friend);
>>      +                               sock_put(was);
>>      +                       }
>>                       }
>>                       newsk->sk_state = TCP_SYN_RECV;
>>      diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
>>      index 5917485..7a63245 100644
>>      --- a/net/ipv4/tcp_output.c
>>      +++ b/net/ipv4/tcp_output.c
>>      @@ -2277,8 +2277,10 @@ struct sk_buff *tcp_make_synack(struct sock *sk, struct dst_entry *dst,
>>               memset(&opts, 0, sizeof(opts));
>>               /* Only try to make friends if enabled */
>>      -       if (sysctl_tcp_friends)
>>      +       if (sysctl_tcp_friends) {
>>      +               sock_hold(sk);
>>                       skb->friend = sk;
>>      +       }
>>        #ifdef CONFIG_SYN_COOKIES
>>               if (unlikely(req->cookie_ts))
>>
>>
>>                    And how about TCP friends v4? :)
>>
>>                    Thanks
>>
>>              Yu
>>
>>
>>
> 
