netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Eric Dumazet <eric.dumazet@gmail.com>
To: Mat Martineau <mathew.j.martineau@linux.intel.com>,
	netdev@vger.kernel.org
Cc: Paolo Abeni <pabeni@redhat.com>, kuba@kernel.org, mptcp@lists.01.org
Subject: Re: [PATCH net-next 10/10] mptcp: refine MPTCP-level ack scheduling
Date: Mon, 23 Nov 2020 12:57:17 +0100	[thread overview]
Message-ID: <ca0b65f8-7a69-ff4e-9e0d-66a7a923b0c1@gmail.com> (raw)
In-Reply-To: <20201119194603.103158-11-mathew.j.martineau@linux.intel.com>



On 11/19/20 8:46 PM, Mat Martineau wrote:
> From: Paolo Abeni <pabeni@redhat.com>
> 
> Send timely MPTCP-level ack is somewhat difficult when
> the insertion into the msk receive level is performed
> by the worker.
> 
> It needs TCP-level dup-ack to notify the MPTCP-level
> ack_seq increase, as both the TCP-level ack seq and the
> rcv window are unchanged.
> 
> We can actually avoid processing incoming data with the
> worker, and let the subflow or recevmsg() send ack as needed.
> 
> When recvmsg() moves the skbs inside the msk receive queue,
> the msk space is still unchanged, so tcp_cleanup_rbuf() could
> end-up skipping TCP-level ack generation. Anyway, when
> __mptcp_move_skbs() is invoked, a known amount of bytes is
> going to be consumed soon: we update rcv wnd computation taking
> them in account.
> 
> Additionally we need to explicitly trigger tcp_cleanup_rbuf()
> when recvmsg() consumes a significant amount of the receive buffer.
> 
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
> ---
>  net/mptcp/options.c  |   1 +
>  net/mptcp/protocol.c | 105 +++++++++++++++++++++----------------------
>  net/mptcp/protocol.h |   8 ++++
>  net/mptcp/subflow.c  |   4 +-
>  4 files changed, 61 insertions(+), 57 deletions(-)
> 
> diff --git a/net/mptcp/options.c b/net/mptcp/options.c
> index 248e3930c0cb..8a59b3e44599 100644
> --- a/net/mptcp/options.c
> +++ b/net/mptcp/options.c
> @@ -530,6 +530,7 @@ static bool mptcp_established_options_dss(struct sock *sk, struct sk_buff *skb,
>  		opts->ext_copy.ack64 = 0;
>  	}
>  	opts->ext_copy.use_ack = 1;
> +	WRITE_ONCE(msk->old_wspace, __mptcp_space((struct sock *)msk));
>  
>  	/* Add kind/length/subtype/flag overhead if mapping is not populated */
>  	if (dss_size == 0)
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 4ae2c4a30e44..748343f1a968 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -407,16 +407,42 @@ static void mptcp_set_timeout(const struct sock *sk, const struct sock *ssk)
>  	mptcp_sk(sk)->timer_ival = tout > 0 ? tout : TCP_RTO_MIN;
>  }
>  
> -static void mptcp_send_ack(struct mptcp_sock *msk)
> +static bool mptcp_subflow_active(struct mptcp_subflow_context *subflow)
> +{
> +	struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
> +
> +	/* can't send if JOIN hasn't completed yet (i.e. is usable for mptcp) */
> +	if (subflow->request_join && !subflow->fully_established)
> +		return false;
> +
> +	/* only send if our side has not closed yet */
> +	return ((1 << ssk->sk_state) & (TCPF_ESTABLISHED | TCPF_CLOSE_WAIT));
> +}
> +
> +static void mptcp_send_ack(struct mptcp_sock *msk, bool force)
>  {
>  	struct mptcp_subflow_context *subflow;
> +	struct sock *pick = NULL;
>  
>  	mptcp_for_each_subflow(msk, subflow) {
>  		struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
>  
> -		lock_sock(ssk);
> -		tcp_send_ack(ssk);
> -		release_sock(ssk);
> +		if (force) {
> +			lock_sock(ssk);
> +			tcp_send_ack(ssk);
> +			release_sock(ssk);
> +			continue;
> +		}
> +
> +		/* if the hintes ssk is still active, use it */
> +		pick = ssk;
> +		if (ssk == msk->ack_hint)
> +			break;
> +	}
> +	if (!force && pick) {
> +		lock_sock(pick);
> +		tcp_cleanup_rbuf(pick, 1);

Calling tcp_cleanup_rbuf() on a socket that was never established is going to fail
with a divide by 0 (mss being 0)

AFAIK, mptcp_recvmsg() can be called right after a socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP)
call.

Probably, after a lock_sock(), you should double check socket state (same above before calling tcp_send_ack())



> +		release_sock(pick);
>  	}
>  }
>  


....

>  
> +		/* be sure to advertise window change */
> +		old_space = READ_ONCE(msk->old_wspace);
> +		if ((tcp_space(sk) - old_space) >= old_space)
> +			mptcp_send_ack(msk, false);
> +

Yes, if we call recvmsg() right after socket(), we will end up calling tcp_cleanup_rbuf(),
while no byte was ever copied/drained.


  reply	other threads:[~2020-11-23 11:57 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-19 19:45 [PATCH net-next 00/10] mptcp: More miscellaneous MPTCP fixes Mat Martineau
2020-11-19 19:45 ` [PATCH net-next 01/10] mptcp: drop WORKER_RUNNING status bit Mat Martineau
2020-11-19 19:45 ` [PATCH net-next 02/10] mptcp: fix state tracking for fallback socket Mat Martineau
2020-11-19 19:45 ` [PATCH net-next 03/10] mptcp: skip to next candidate if subflow has unacked data Mat Martineau
2020-11-19 19:45 ` [PATCH net-next 04/10] selftests: mptcp: add link failure test case Mat Martineau
2020-11-19 19:45 ` [PATCH net-next 05/10] mptcp: keep unaccepted MPC subflow into join list Mat Martineau
2020-11-19 19:45 ` [PATCH net-next 06/10] mptcp: change add_addr_signal type Mat Martineau
2020-11-19 19:46 ` [PATCH net-next 07/10] mptcp: send out dedicated ADD_ADDR packet Mat Martineau
2020-11-19 19:46 ` [PATCH net-next 08/10] selftests: mptcp: add ADD_ADDR IPv6 test cases Mat Martineau
2020-11-19 19:46 ` [PATCH net-next 09/10] mptcp: track window announced to peer Mat Martineau
2020-11-19 19:46 ` [PATCH net-next 10/10] mptcp: refine MPTCP-level ack scheduling Mat Martineau
2020-11-23 11:57   ` Eric Dumazet [this message]
2020-11-23 14:21     ` Paolo Abeni
2020-11-20 23:35 ` [PATCH net-next 00/10] mptcp: More miscellaneous MPTCP fixes Jakub Kicinski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ca0b65f8-7a69-ff4e-9e0d-66a7a923b0c1@gmail.com \
    --to=eric.dumazet@gmail.com \
    --cc=kuba@kernel.org \
    --cc=mathew.j.martineau@linux.intel.com \
    --cc=mptcp@lists.01.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).