netdev.vger.kernel.org archive mirror
* Re: [PATCH net-next] tcp: add new parameter tcp_inherit_buffsize to control the initial buffer size for new passive connection
       [not found] <CAPYxyxJYXA_maME_qnshnFdzwDUkU1PvgDBQMfH32EerfQJZxA@mail.gmail.com>
@ 2017-05-02 15:05 ` Eric Dumazet
       [not found]   ` <CAPYxyxJBPO6W_wjYwXZQ5DS0sFjLh+A_GPSz1qn6qRZkqhMFQQ@mail.gmail.com>
  0 siblings, 1 reply; 4+ messages in thread
From: Eric Dumazet @ 2017-05-02 15:05 UTC (permalink / raw)
  To: 单卫; +Cc: netdev, David Miller

On Tue, 2017-05-02 at 12:57 +0800, 单卫 wrote:
> Currently, the sndbuf/rcvbuf values of a new passive connection are
> inherited from the listening socket. After that,
> tcp_init_buffer_space() initializes them based on init_cwnd, and
> tcp_should_expand_sndbuf() adjusts them as cwnd grows.
> 
> However, Operations & Maintenance engineers cannot change
> sndbuf/rcvbuf online via the tcp_wmem/tcp_rmem sysctl parameters:
> they have to migrate flows away, restart the services with larger
> sndbuf/rcvbuf, and then migrate the flows back.
> 
> This patch helps such online services: with tcp_inherit_buffsize set
> to 0, a new passive connection takes its initial buffer sizes from
> tcp_wmem/tcp_rmem. The default value keeps the previous behavior.
> 
> 
> Signed-off-by: Shan Wei <shanwei88@gmail.com>
> ---
>  Documentation/networking/ip-sysctl.txt |    8 ++++++++
>  include/net/netns/ipv4.h               |    1 +
>  net/ipv4/sysctl_net_ipv4.c             |    7 +++++++
>  net/ipv4/tcp_ipv4.c                    |    1 +
>  net/ipv4/tcp_minisocks.c               |    6 ++++++
>  5 files changed, 23 insertions(+), 0 deletions(-)
> 
> 
> diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
> index 974ab47..3292bbf 100644
> --- a/Documentation/networking/ip-sysctl.txt
> +++ b/Documentation/networking/ip-sysctl.txt
> @@ -308,6 +308,14 @@ tcp_frto - INTEGER
>  
>   By default it's enabled with a non-zero value. 0 disables F-RTO.
>  
> +tcp_inherit_buffsize - BOOLEAN
> + For a new passive TCP connection, use the current tcp_wmem/tcp_rmem
> + parameters to set the initial snd/rcv buffer sizes: useful for online
> + services that cannot be restarted, just set it to 0. By default, a new
> + passive connection inherits snd/rcv buffer sizes from the listening socket.
> +
> + Default: 1
> +
>  tcp_invalid_ratelimit - INTEGER
>   Limit the maximal rate for sending duplicate acknowledgments
>   in response to incoming TCP packets that are for an existing
> diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
> index cd686c4..1fc85bd 100644
> --- a/include/net/netns/ipv4.h
> +++ b/include/net/netns/ipv4.h
> @@ -124,6 +124,7 @@ struct netns_ipv4 {
>   int sysctl_tcp_tw_reuse;
>   struct inet_timewait_death_row tcp_death_row;
>   int sysctl_max_syn_backlog;
> + int sysctl_tcp_inherit_buffsize;
>  
>  #ifdef CONFIG_NET_L3_MASTER_DEV
>   int sysctl_udp_l3mdev_accept;
> diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
> index 86957e9..58a8bec 100644
> --- a/net/ipv4/sysctl_net_ipv4.c
> +++ b/net/ipv4/sysctl_net_ipv4.c
> @@ -1078,6 +1078,13 @@ static int proc_tfo_blackhole_detect_timeout(struct ctl_table *table,
>   .mode = 0644,
>   .proc_handler = proc_dointvec
>   },
> + {
> + .procname = "tcp_inherit_buffsize",
> + .data = &init_net.ipv4.sysctl_tcp_inherit_buffsize,
> + .maxlen = sizeof(int),
> + .mode = 0644,
> + .proc_handler = proc_dointvec
> + },
>  #ifdef CONFIG_IP_ROUTE_MULTIPATH
>   {
>   .procname = "fib_multipath_use_neigh",
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index cbbafe5..7a8a2bb 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -2457,6 +2457,7 @@ static int __net_init tcp_sk_init(struct net *net)
>   net->ipv4.tcp_death_row.hashinfo = &tcp_hashinfo;
>  
>   net->ipv4.sysctl_max_syn_backlog = max(128, cnt / 256);
> + net->ipv4.sysctl_tcp_inherit_buffsize = 1;
>  
>   return 0;
>  fail:
> diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
> index 8f6373b..6090d7e 100644
> --- a/net/ipv4/tcp_minisocks.c
> +++ b/net/ipv4/tcp_minisocks.c
> @@ -474,6 +474,12 @@ struct sock *tcp_create_openreq_child(const struct sock *sk,
>   tcp_init_xmit_timers(newsk);
>   newtp->write_seq = newtp->pushed_seq = treq->snt_isn + 1;
>  
> + if (!(sock_net(newsk)->ipv4.sysctl_tcp_inherit_buffsize) &&
> + !(sk->sk_userlocks & SOCK_SNDBUF_LOCK)) {
> + newsk->sk_sndbuf = sock_net(sk)->ipv4.sysctl_tcp_wmem[1];
> + newsk->sk_rcvbuf = sock_net(sk)->ipv4.sysctl_tcp_rmem[1];
> + }
> +
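To make the intended workflow concrete: a hypothetical operator session with this patch applied might look like the following (the tcp_inherit_buffsize knob exists only with this patch, and the tcp_wmem/tcp_rmem values are purely illustrative):

```shell
# Stop inheriting buffer sizes from the listening socket, so new
# passive connections pick up the sysctl defaults instead
# (tcp_inherit_buffsize is added by this patch; not in mainline).
sysctl -w net.ipv4.tcp_inherit_buffsize=0

# Raise the default (middle) value online; new accepted connections
# start from tcp_{w,r}mem[1] without restarting the service.
sysctl -w net.ipv4.tcp_wmem="4096 1611646 6291456"
sysctl -w net.ipv4.tcp_rmem="4096 1611646 6291456"
```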

Hi Shan

1) Your patch never reached netdev, because it was sent in HTML format.

2) During Linus merge window, net-next is closed

I am not really convinced that we need this with TCP autotuning anyway.
Initial value of sk_sndbuf and sk_rcvbuf is really a hint.

How often do you really tweak /proc/sys/net/ipv4 files in production?

Please provide more information, like what actual values you change back
and forth.

Thanks.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH net-next] tcp: add new parameter tcp_inherit_buffsize to control the initial buffer size for new passive connection
       [not found]   ` <CAPYxyxJBPO6W_wjYwXZQ5DS0sFjLh+A_GPSz1qn6qRZkqhMFQQ@mail.gmail.com>
@ 2017-05-05 13:52     ` Eric Dumazet
  2017-05-08  7:49       ` Shan Wei
  0 siblings, 1 reply; 4+ messages in thread
From: Eric Dumazet @ 2017-05-05 13:52 UTC (permalink / raw)
  To: 单卫; +Cc: netdev, David Miller

On Fri, 2017-05-05 at 18:22 +0800, 单卫 wrote:
> 
> 
> 2017-05-02 23:05 GMT+08:00 Eric Dumazet <eric.dumazet@gmail.com>:
>         
>         
>         Hi Shan
>         
>         1) Your patch never reached netdev, because it was sent in
>         HTML format.
>         
>         2) During Linus merge window, net-next is closed
>         
>         I am not really convinced that we need this with TCP
>         autotuning anyway.
>         Initial value of sk_sndbuf and sk_rcvbuf is really a hint.
>         
>         How often do you really tweak /proc/sys/net/ipv4 files in
>         production.
> 
> 
> 
Again you posted your reply in HTML, so netdev did not get it.

> 
> Historically, most products have not adjusted any wmem/rmem
> parameters. In our experience, tuning them gives users a better,
> faster browsing experience.
> 
> Now more and more products want to change these defaults.
> But for a given product, once the parameters are tuned they are
> rarely changed again.
> 
> 
> We hope this patch can provide the ability to adjust the
> parameters online and have them take effect quickly.
> 
> Without it, adjusting the parameters is time-consuming: we have
> to migrate flows away and back.
> 
> 
> 
>          
>         Please provide more information, like what actual values you
>         change back
>         and forth.
>         
> 
> 
> There is no fixed value here. A big value takes up more memory, but
> we are willing to trade memory for performance.
> 
> We choose the parameter value based on the characteristics of the
> product.
> 
> For example, for 1MB files we set buffsize to 1611646 =
> (1MB / 1460) * 2244, with MSS = 1460 and SKB_TRUESIZE = 2244.
> 
> 
> With TCP autotuning, the buffer only reaches that size once cwnd has
> grown from 10 to 359.
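Shan's arithmetic can be checked with a quick sketch (the constants are the ones he quotes; the cwnd relation in the last line is my own reading, based on a send-buffer target of roughly 2 * cwnd * truesize, and is not stated in the thread):

```python
# Re-derive the quoted buffer size for serving a 1MB file.
FILE_SIZE = 1 * 1024 * 1024   # bytes per object
MSS = 1460                    # payload bytes per segment
SKB_TRUESIZE = 2244           # kernel memory charged per segment

# Memory needed to hold the whole file as queued skbs:
buffsize = FILE_SIZE * SKB_TRUESIZE // MSS
print(buffsize)               # 1611646

# If the send buffer targets ~2 * cwnd * truesize (my assumption),
# this buffer corresponds to a cwnd of about:
print(buffsize // (2 * SKB_TRUESIZE))  # 359
```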

Looks like a problem with autotuning because you might have HZ=100 or
HZ=250 maybe ?

We want to make autotuning better, so that tweaks like that are no
longer needed.




* Re: [PATCH net-next] tcp: add new parameter tcp_inherit_buffsize to control the initial buffer size for new passive connection
  2017-05-05 13:52     ` Eric Dumazet
@ 2017-05-08  7:49       ` Shan Wei
  2017-05-08 14:46         ` Eric Dumazet
  0 siblings, 1 reply; 4+ messages in thread
From: Shan Wei @ 2017-05-08  7:49 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, David Miller

On 2017/5/5 21:52, Eric Dumazet wrote:
> On Fri, 2017-05-05 at 18:22 +0800, 单卫 wrote:
>>
>> 2017-05-02 23:05 GMT+08:00 Eric Dumazet <eric.dumazet@gmail.com>:
>>         
>>         
>>         Hi Shan
>>         
>>         1) Your patch never reached netdev, because it was sent in
>>         HTML format.
>>         
>>         2) During Linus merge window, net-next is closed
>>         
>>         I am not really convinced that we need this with TCP
>>         autotuning anyway.
>>         Initial value of sk_sndbuf and sk_rcvbuf is really a hint.
>>         
>>         How often do you really tweak /proc/sys/net/ipv4 files in
>>         production.
>>
>>
>>
> Again you posted your reply in HTML, so netdev did not get it.
Fixed now: I switched to Thunderbird from the Gmail web client.
>> Historically, most products have not adjusted any wmem/rmem
>> parameters. In our experience, tuning them gives users a better,
>> faster browsing experience.
>>
>> Now more and more products want to change these defaults.
>> But for a given product, once the parameters are tuned they are
>> rarely changed again.
>>
>>
>> We hope this patch can provide the ability to adjust the
>> parameters online and have them take effect quickly.
>>
>> Without it, adjusting the parameters is time-consuming: we have
>> to migrate flows away and back.
>>
>>
>>
>>          
>>         Please provide more information, like what actual values you
>>         change back
>>         and forth.
>>         
>>
>>
>> There is no fixed value here. A big value takes up more memory, but
>> we are willing to trade memory for performance.
>>
>> We choose the parameter value based on the characteristics of the
>> product.
>>
>> For example, for 1MB files we set buffsize to 1611646 =
>> (1MB / 1460) * 2244, with MSS = 1460 and SKB_TRUESIZE = 2244.
>>
>>
>> With TCP autotuning, the buffer only reaches that size once cwnd
>> has grown from 10 to 359.
> Looks like a problem with autotuning because you might have HZ=100 or
> HZ=250 maybe ?
Indeed, HZ=250. What is the problem with autotuning when HZ=250?


* Re: [PATCH net-next] tcp: add new parameter tcp_inherit_buffsize to control the initial buffer size for new passive connection
  2017-05-08  7:49       ` Shan Wei
@ 2017-05-08 14:46         ` Eric Dumazet
  0 siblings, 0 replies; 4+ messages in thread
From: Eric Dumazet @ 2017-05-08 14:46 UTC (permalink / raw)
  To: Shan Wei; +Cc: netdev, David Miller

On Mon, 2017-05-08 at 15:49 +0800, Shan Wei wrote:

> Indeed,  HZ=250.  What is the problem about autotuning with HZ=250?
> 

You might look at recent patches.

commit 645f4c6f2ebd040688cc2a5f626ffc909e66ccf2
("tcp: switch rcv_rtt_est and rcvq_space to high resolution timestamps")
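For context, a rough sketch of why the jiffies granularity matters (my illustration; the commit above moves rcv_rtt_est and rcvq_space to high-resolution timestamps precisely to remove this limit):

```python
# With a jiffies-based RTT estimator, one clock tick is the smallest
# RTT the kernel can resolve.  At HZ=250 a tick is 4 ms, so
# sub-millisecond datacenter RTTs cannot be measured accurately and
# receive-buffer autotuning grows far more slowly than the real RTT
# would allow.
for hz in (100, 250, 1000):
    tick_ms = 1000 / hz
    print(f"HZ={hz}: smallest resolvable RTT ~ {tick_ms} ms")
```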

Thanks.


