From: Neal Cardwell <ncardwell@google.com>
To: David Miller <davem@davemloft.net>
Cc: netdev@vger.kernel.org, Yuchung Cheng <ycheng@google.com>,
	Van Jacobson <vanj@google.com>,
	Neal Cardwell <ncardwell@google.com>,
	Nandita Dukkipati <nanditad@google.com>,
	Eric Dumazet <edumazet@google.com>,
	Soheil Hassas Yeganeh <soheil@google.com>
Subject: [PATCH v3 net-next 13/16] tcp: allow congestion control to expand send buffer differently
Date: Sun, 18 Sep 2016 18:03:50 -0400
Message-ID: <1474236233-28511-14-git-send-email-ncardwell@google.com>
In-Reply-To: <1474236233-28511-1-git-send-email-ncardwell@google.com>

From: Yuchung Cheng <ycheng@google.com>

Currently the TCP send buffer expands to twice cwnd, in order to allow
limited transmits in the CA_Recovery state. This assumes that cwnd
does not increase in the CA_Recovery state.

For some congestion control algorithms, like the upcoming BBR module,
if the losses in recovery do not indicate congestion then we may
continue to raise cwnd multiplicatively in recovery. In such cases the
current multiplier will falsely limit the sending rate, much as if it
were limited by the application.

This commit adds an optional congestion control callback to use a
different multiplier to expand the TCP send buffer. For congestion
control modules that do not specify this callback, TCP continues to
use the previous default of 2.
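
As a rough illustration of the behavior described above, here is a
minimal userspace sketch, not kernel code. The model_* types, the
example_sndbuf_expand() callback, and the multiplier value 3 are all
hypothetical stand-ins chosen for illustration; only the "callback if
present, else 2" selection mirrors what this patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only: hypothetical stand-ins, not the kernel structs. */
struct model_sock;

struct model_ca_ops {
	/* Optional: returns the send buffer multiplier; NULL means default. */
	uint32_t (*sndbuf_expand)(struct model_sock *sk);
};

struct model_sock {
	const struct model_ca_ops *ca_ops;
	int sndbuf;	/* current send buffer size, in bytes */
};

/* Hypothetical module callback; the value 3 is purely illustrative. */
static uint32_t example_sndbuf_expand(struct model_sock *sk)
{
	(void)sk;
	return 3;
}

/*
 * Model of the expansion step: pick the multiplier from the callback if
 * the module provides one, otherwise fall back to 2, then grow (never
 * shrink) the buffer, capped at wmem_max.
 */
static int model_sndbuf_target(struct model_sock *sk, int nr_segs,
			       int per_mss, int wmem_max)
{
	int sndmem;

	sndmem = sk->ca_ops->sndbuf_expand ?
		 (int)sk->ca_ops->sndbuf_expand(sk) : 2;
	sndmem *= nr_segs * per_mss;

	if (sk->sndbuf < sndmem)
		sk->sndbuf = sndmem < wmem_max ? sndmem : wmem_max;
	return sk->sndbuf;
}
```

With nr_segs = 10 and per_mss = 2000, a module without the callback
ends up with a 40000-byte target, while the hypothetical module above
gets 60000 bytes, subject to the wmem_max cap.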

Signed-off-by: Van Jacobson <vanj@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
---
 include/net/tcp.h    | 2 ++
 net/ipv4/tcp_input.c | 4 +++-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 3492041..1aa9628 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -917,6 +917,8 @@ struct tcp_congestion_ops {
 	void (*pkts_acked)(struct sock *sk, const struct ack_sample *sample);
 	/* suggest number of segments for each skb to transmit (optional) */
 	u32 (*tso_segs_goal)(struct sock *sk);
+	/* returns the multiplier used in tcp_sndbuf_expand (optional) */
+	u32 (*sndbuf_expand)(struct sock *sk);
 	/* get info for inet_diag (optional) */
 	size_t (*get_info)(struct sock *sk, u32 ext, int *attr,
 			   union tcp_cc_info *info);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 17de77d..5af0bf3 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -289,6 +289,7 @@ static bool tcp_ecn_rcv_ecn_echo(const struct tcp_sock *tp, const struct tcphdr
 static void tcp_sndbuf_expand(struct sock *sk)
 {
 	const struct tcp_sock *tp = tcp_sk(sk);
+	const struct tcp_congestion_ops *ca_ops = inet_csk(sk)->icsk_ca_ops;
 	int sndmem, per_mss;
 	u32 nr_segs;
 
@@ -309,7 +310,8 @@ static void tcp_sndbuf_expand(struct sock *sk)
 	 * Cubic needs 1.7 factor, rounded to 2 to include
 	 * extra cushion (application might react slowly to POLLOUT)
 	 */
-	sndmem = 2 * nr_segs * per_mss;
+	sndmem = ca_ops->sndbuf_expand ? ca_ops->sndbuf_expand(sk) : 2;
+	sndmem *= nr_segs * per_mss;
 
 	if (sk->sk_sndbuf < sndmem)
 		sk->sk_sndbuf = min(sndmem, sysctl_tcp_wmem[2]);
-- 
2.8.0.rc3.226.g39d4020
