From mboxrd@z Thu Jan 1 00:00:00 1970 Content-Type: multipart/mixed; boundary="===============3402296533320811739==" MIME-Version: 1.0 From: Christoph Paasch To: mptcp at lists.01.org Subject: [MPTCP] Re: [PATCH mptcp-next 2/2] mptcp: add receive buffer auto-tuning Date: Tue, 16 Jun 2020 09:09:12 -0700 Message-ID: <20200616160912.GI3048@MacBook-Pro-64.local> In-Reply-To: 20200610115534.17417-3-fw@strlen.de X-Status: X-Keywords: X-UID: 4704 --===============3402296533320811739== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Hello, On 06/10/20 - 13:55, Florian Westphal wrote: > When mptcp is used, userspace doesn't read from the tcp (subflow) > socket but from the parent (mptcp) socket receive queue. > = > skbs are moved from the subflow socket to the mptcp rx queue either from > 'data_ready' callback (if mptcp socket can be locked), a work queue, or > the socket receive function. > = > This means tcp_rcv_space_adjust() is never called and thus no receive > buffer size auto-tuning is done. > = > An earlier (not merged) patch added tcp_rcv_space_adjust() calls to the > function that moves skbs from subflow to mptcp socket. > While this enabled autotuning, it also meant tuning was done even if > userspace was reading the mptcp socket very slowly. > = > This adds mptcp_rcv_space_adjust() and calls it after userspace has > read data from the mptcp socket rx queue. > = > Its very similar to tcp_rcv_space_adjust, with two differences: > = > 1. The rtt estimate is the largest one observed on a subflow > 2. The rcvbuf size and window clamp of all subflows is adjusted > to the mptcp-level rcvbuf. > = > Otherwise, we get spurious drops at tcp (subflow) socket level if > the skbs are not moved to the mptcp socket fast enough and reduced > throughput.. > = > Before: > time mptcp_connect.sh -t -f $((4*1024*1024)) -d 300 -l 0.01% -r 0 -e "" -= m mmap > [..] > ns4 MPTCP -> ns3 (10.0.3.2:10108 ) MPTCP (duration 40562ms) [ OK ] > ns4 MPTCP -> ns3 (10.0.3.2:10109 ) TCP (duration 5415ms) [ OK ] > ns4 TCP -> ns3 (10.0.3.2:10110 ) MPTCP (duration 5413ms) [ OK ] > ns4 MPTCP -> ns3 (dead:beef:3::2:10111) MPTCP (duration 41331ms) [ OK ] > ns4 MPTCP -> ns3 (dead:beef:3::2:10112) TCP (duration 5415ms) [ OK ] > ns4 TCP -> ns3 (dead:beef:3::2:10113) MPTCP (duration 5714ms) [ OK ] > Time: 846 seconds > = > After: > ns4 MPTCP -> ns3 (10.0.3.2:10108 ) MPTCP (duration 5417ms) [ OK ] > ns4 MPTCP -> ns3 (10.0.3.2:10109 ) TCP (duration 5429ms) [ OK ] > ns4 TCP -> ns3 (10.0.3.2:10110 ) MPTCP (duration 5418ms) [ OK ] > ns4 MPTCP -> ns3 (dead:beef:3::2:10111) MPTCP (duration 5423ms) [ OK ] > ns4 MPTCP -> ns3 (dead:beef:3::2:10112) TCP (duration 5715ms) [ OK ] > ns4 TCP -> ns3 (dead:beef:3::2:10113) MPTCP (duration 5415ms) [ OK ] > Time: 275 seconds > = > Signed-off-by: Florian Westphal > --- > changes since last version: > - adjust for 'mptcp fallback' patches > - add 'const' qualifier to function argument syzkaller found another divide-by-zero. I don't yet have a real repro generated by syzkaller, but from the logs it looks like it is the simultaneous self-connect case. In Davide's other patch I think we need to call mptcp_rcv_space_init in this hunk: diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index fc3be8ab2168..781836f93a68 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -1114,6 +1114,16 @@ static void subflow_state_change(struct sock *sk) __subflow_state_change(sk); + if (subflow_simultaneous_connect(sk)) { + mptcp_do_fallback(sk); + pr_fallback(mptcp_sk(parent)); + subflow->conn_finished =3D 1; + if (inet_sk_state_load(parent) =3D=3D TCP_SYN_SENT) { + inet_sk_state_store(parent, TCP_ESTABLISHED); + parent->sk_state_change(parent); + } + } + /* as recvmsg() does not acquire the subflow socket for ssk selecti= on * a fin packet carrying a DSS can be unnoticed if we don't trigger * the data available machinery here. Christoph --===============3402296533320811739==--