netdev.vger.kernel.org archive mirror
From: Stephen Hemminger <stephen@networkplumber.org>
To: Neal Cardwell <ncardwell@google.com>
Cc: David Miller <davem@davemloft.net>,
	netdev@vger.kernel.org, Yuchung Cheng <ycheng@google.com>,
	Van Jacobson <vanj@google.com>,
	Nandita Dukkipati <nanditad@google.com>,
	Eric Dumazet <edumazet@google.com>,
	Soheil Hassas Yeganeh <soheil@google.com>
Subject: Re: [PATCH v4 net-next 13/16] tcp: allow congestion control to expand send buffer differently
Date: Tue, 20 Sep 2016 10:48:01 -0700
Message-ID: <20160920104801.67082004@xeon-e3>
In-Reply-To: <1474342763-16715-14-git-send-email-ncardwell@google.com>

On Mon, 19 Sep 2016 23:39:20 -0400
Neal Cardwell <ncardwell@google.com> wrote:

> From: Yuchung Cheng <ycheng@google.com>
> 
> Currently the TCP send buffer expands to twice cwnd, in order to allow
> limited transmits in the CA_Recovery state. This assumes that cwnd
> does not increase in the CA_Recovery.
> 
> For some congestion control algorithms, like the upcoming BBR module,
> if the losses in recovery do not indicate congestion then we may
> continue to raise cwnd multiplicatively in recovery. In such cases the
> current multiplier will falsely limit the sending rate, much as if it
> were limited by the application.
> 
> This commit adds an optional congestion control callback to use a
> different multiplier to expand the TCP send buffer. For congestion
> control modules that do not specify this callback, TCP continues to
> use the previous default of 2.
> 
> Signed-off-by: Van Jacobson <vanj@google.com>
> Signed-off-by: Neal Cardwell <ncardwell@google.com>
> Signed-off-by: Yuchung Cheng <ycheng@google.com>
> Signed-off-by: Nandita Dukkipati <nanditad@google.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
> ---
>  include/net/tcp.h    | 2 ++
>  net/ipv4/tcp_input.c | 4 +++-
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index 3492041..1aa9628 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -917,6 +917,8 @@ struct tcp_congestion_ops {
>  	void (*pkts_acked)(struct sock *sk, const struct ack_sample *sample);
>  	/* suggest number of segments for each skb to transmit (optional) */
>  	u32 (*tso_segs_goal)(struct sock *sk);
> +	/* returns the multiplier used in tcp_sndbuf_expand (optional) */
> +	u32 (*sndbuf_expand)(struct sock *sk);
>  	/* get info for inet_diag (optional) */
>  	size_t (*get_info)(struct sock *sk, u32 ext, int *attr,
>  			   union tcp_cc_info *info);
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 17de77d..5af0bf3 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -289,6 +289,7 @@ static bool tcp_ecn_rcv_ecn_echo(const struct tcp_sock *tp, const struct tcphdr
>  static void tcp_sndbuf_expand(struct sock *sk)
>  {
>  	const struct tcp_sock *tp = tcp_sk(sk);
> +	const struct tcp_congestion_ops *ca_ops = inet_csk(sk)->icsk_ca_ops;
>  	int sndmem, per_mss;
>  	u32 nr_segs;
>  
> @@ -309,7 +310,8 @@ static void tcp_sndbuf_expand(struct sock *sk)
>  	 * Cubic needs 1.7 factor, rounded to 2 to include
>  	 * extra cushion (application might react slowly to POLLOUT)
>  	 */
> -	sndmem = 2 * nr_segs * per_mss;
> +	sndmem = ca_ops->sndbuf_expand ? ca_ops->sndbuf_expand(sk) : 2;

You could avoid the conditional (if it mattered) by having every module
inherit a default callback, but that would mean changing all existing
congestion control modules. So doing it this way makes life easier.
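
To make the trade-off concrete, here is a rough userspace sketch of the two
options. It is not kernel code: struct cc_ops, sndbuf_factor(), cc_register(),
default_sndbuf_expand() and wide_sndbuf_expand() are hypothetical stand-ins
invented for illustration, and the factor of 3 is just an example value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; NOT the real kernel structures. */
struct sock { int unused; };

struct cc_ops {
	const char *name;
	/* optional: returns the send buffer multiplier, NULL if unset */
	unsigned int (*sndbuf_expand)(struct sock *sk);
};

#define DEFAULT_SNDBUF_FACTOR 2u

/* Option A (what the patch does): test for the callback at the call site. */
static unsigned int sndbuf_factor(const struct cc_ops *ops, struct sock *sk)
{
	return ops->sndbuf_expand ? ops->sndbuf_expand(sk)
				  : DEFAULT_SNDBUF_FACTOR;
}

/* Option B: a default callback that every module would end up with. */
static unsigned int default_sndbuf_expand(struct sock *sk)
{
	(void)sk;
	return DEFAULT_SNDBUF_FACTOR;
}

/*
 * Assumed registration hook for option B: fill in the default once so the
 * hot path could call ops->sndbuf_expand(sk) without a NULL check.
 */
static void cc_register(struct cc_ops *ops)
{
	assert(ops != NULL);
	if (!ops->sndbuf_expand)
		ops->sndbuf_expand = default_sndbuf_expand;
}

/* Example module that wants a larger cushion (value is illustrative). */
static unsigned int wide_sndbuf_expand(struct sock *sk)
{
	(void)sk;
	return 3;
}
```

With option A a module that leaves sndbuf_expand as NULL keeps the factor
of 2 with no source change, which is the "makes life easier" part. Option B
trades the branch for an unconditional indirect call and needs either a
fill-in step like the one above or an edit to every existing module.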

Acked-by: Stephen Hemminger <stephen@networkplumber.org>

Thread overview: 32+ messages
2016-09-20  3:39 [PATCH v4 net-next 00/16] tcp: BBR congestion control algorithm Neal Cardwell
2016-09-20  3:39 ` [PATCH v4 net-next 01/16] tcp: cdg: rename struct minmax in tcp_cdg.c to avoid a naming conflict Neal Cardwell
2016-09-20 16:40   ` Kenneth Klette Jonassen
2016-09-20  3:39 ` [PATCH v4 net-next 02/16] lib/win_minmax: windowed min or max estimator Neal Cardwell
2016-09-20  3:39 ` [PATCH v4 net-next 03/16] tcp: use windowed min filter library for TCP min_rtt estimation Neal Cardwell
2016-09-20  3:39 ` [PATCH v4 net-next 04/16] net_sched: sch_fq: add low_rate_threshold parameter Neal Cardwell
2016-09-20  3:39 ` [PATCH v4 net-next 05/16] tcp: switch back to proper tcp_skb_cb size check in tcp_init() Neal Cardwell
2016-09-20  3:39 ` [PATCH v4 net-next 06/16] tcp: count packets marked lost for a TCP connection Neal Cardwell
2016-09-20  3:39 ` [PATCH v4 net-next 07/16] tcp: track data delivery rate " Neal Cardwell
2016-09-20  3:39 ` [PATCH v4 net-next 08/16] tcp: track application-limited rate samples Neal Cardwell
2016-09-20  3:39 ` [PATCH v4 net-next 09/16] tcp: export data delivery rate Neal Cardwell
2016-09-20  3:39 ` [PATCH v4 net-next 10/16] tcp: allow congestion control module to request TSO skb segment count Neal Cardwell
2016-09-20  3:39 ` [PATCH v4 net-next 11/16] tcp: export tcp_tso_autosize() and parameterize minimum number of TSO segments Neal Cardwell
2016-09-20  3:39 ` [PATCH v4 net-next 12/16] tcp: export tcp_mss_to_mtu() for congestion control modules Neal Cardwell
2016-09-20  3:39 ` [PATCH v4 net-next 13/16] tcp: allow congestion control to expand send buffer differently Neal Cardwell
2016-09-20 17:48   ` Stephen Hemminger [this message]
2016-09-20 18:43     ` Yuchung Cheng
2016-09-21  9:25     ` David Laight
2016-09-20  3:39 ` [PATCH v4 net-next 14/16] tcp: new CC hook to set sending rate with rate_sample in any CA state Neal Cardwell
2016-09-20  3:39 ` [PATCH v4 net-next 15/16] tcp: increase ICSK_CA_PRIV_SIZE from 64 bytes to 88 Neal Cardwell
2016-09-20  3:39 ` [PATCH v4 net-next 16/16] tcp_bbr: add BBR congestion control Neal Cardwell
2016-09-20 18:48   ` Stephen Hemminger
2016-09-20 18:50     ` Yuchung Cheng
2016-09-20 18:50     ` Neal Cardwell
2016-09-21  2:57       ` Neal Cardwell
2016-09-20 23:39   ` Stephen Hemminger
2016-09-20 23:42     ` Neal Cardwell
2016-09-20 23:56       ` Eric Dumazet
2016-09-30 15:42   ` Lawrence Brakmo
2016-09-21  4:44 ` [PATCH v4 net-next 00/16] tcp: BBR congestion control algorithm David Miller
2016-09-21 11:53   ` Neal Cardwell
2016-09-21 12:51     ` Thomas Graf
