From: Eric Dumazet <eric.dumazet@gmail.com>
To: Mat Martineau <mathew.j.martineau@linux.intel.com>,
netdev@vger.kernel.org
Cc: Paolo Abeni <pabeni@redhat.com>, kuba@kernel.org, mptcp@lists.01.org
Subject: Re: [PATCH net-next 10/10] mptcp: refine MPTCP-level ack scheduling
Date: Mon, 23 Nov 2020 12:57:17 +0100
Message-ID: <ca0b65f8-7a69-ff4e-9e0d-66a7a923b0c1@gmail.com>
In-Reply-To: <20201119194603.103158-11-mathew.j.martineau@linux.intel.com>
On 11/19/20 8:46 PM, Mat Martineau wrote:
> From: Paolo Abeni <pabeni@redhat.com>
>
> Sending timely MPTCP-level acks is somewhat difficult when
> the insertion into the msk receive level is performed
> by the worker.
>
> It needs TCP-level dup-ack to notify the MPTCP-level
> ack_seq increase, as both the TCP-level ack seq and the
> rcv window are unchanged.
>
> We can actually avoid processing incoming data with the
> worker, and let the subflow or recvmsg() send ack as needed.
>
> When recvmsg() moves the skbs inside the msk receive queue,
> the msk space is still unchanged, so tcp_cleanup_rbuf() could
> end-up skipping TCP-level ack generation. Anyway, when
> __mptcp_move_skbs() is invoked, a known amount of bytes is
> going to be consumed soon: we update rcv wnd computation taking
> them into account.
>
> Additionally we need to explicitly trigger tcp_cleanup_rbuf()
> when recvmsg() consumes a significant amount of the receive buffer.
>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
> ---
> net/mptcp/options.c | 1 +
> net/mptcp/protocol.c | 105 +++++++++++++++++++++----------------------
> net/mptcp/protocol.h | 8 ++++
> net/mptcp/subflow.c | 4 +-
> 4 files changed, 61 insertions(+), 57 deletions(-)
>
> diff --git a/net/mptcp/options.c b/net/mptcp/options.c
> index 248e3930c0cb..8a59b3e44599 100644
> --- a/net/mptcp/options.c
> +++ b/net/mptcp/options.c
> @@ -530,6 +530,7 @@ static bool mptcp_established_options_dss(struct sock *sk, struct sk_buff *skb,
> opts->ext_copy.ack64 = 0;
> }
> opts->ext_copy.use_ack = 1;
> + WRITE_ONCE(msk->old_wspace, __mptcp_space((struct sock *)msk));
>
> /* Add kind/length/subtype/flag overhead if mapping is not populated */
> if (dss_size == 0)
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 4ae2c4a30e44..748343f1a968 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -407,16 +407,42 @@ static void mptcp_set_timeout(const struct sock *sk, const struct sock *ssk)
> mptcp_sk(sk)->timer_ival = tout > 0 ? tout : TCP_RTO_MIN;
> }
>
> -static void mptcp_send_ack(struct mptcp_sock *msk)
> +static bool mptcp_subflow_active(struct mptcp_subflow_context *subflow)
> +{
> + struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
> +
> + /* can't send if JOIN hasn't completed yet (i.e. is usable for mptcp) */
> + if (subflow->request_join && !subflow->fully_established)
> + return false;
> +
> + /* only send if our side has not closed yet */
> + return ((1 << ssk->sk_state) & (TCPF_ESTABLISHED | TCPF_CLOSE_WAIT));
> +}
> +
> +static void mptcp_send_ack(struct mptcp_sock *msk, bool force)
> {
> struct mptcp_subflow_context *subflow;
> + struct sock *pick = NULL;
>
> mptcp_for_each_subflow(msk, subflow) {
> struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
>
> - lock_sock(ssk);
> - tcp_send_ack(ssk);
> - release_sock(ssk);
> + if (force) {
> + lock_sock(ssk);
> + tcp_send_ack(ssk);
> + release_sock(ssk);
> + continue;
> + }
> +
> + /* if the hinted ssk is still active, use it */
> + pick = ssk;
> + if (ssk == msk->ack_hint)
> + break;
> + }
> + if (!force && pick) {
> + lock_sock(pick);
> + tcp_cleanup_rbuf(pick, 1);

Calling tcp_cleanup_rbuf() on a socket that was never established is going to fail
with a divide by 0 (mss being 0).

AFAIK, mptcp_recvmsg() can be called right after a
socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP) call.

Probably, after lock_sock(), you should double-check the socket state
(same above, before calling tcp_send_ack()).

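The kind of check suggested above can be sketched in userspace C. This is only a model, not kernel code: struct model_sock, model_can_ack() and model_cleanup_rbuf() are hypothetical names, the state values are assumed to mirror the kernel's TCP state enum, and the division merely stands in for whatever MSS-based arithmetic faults on a never-established socket. The state mask is the one the quoted mptcp_subflow_active() helper already uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel TCP states
 * (values assumed to match include/net/tcp_states.h). */
enum {
	MODEL_TCP_ESTABLISHED = 1,
	MODEL_TCP_CLOSE       = 7,
	MODEL_TCP_CLOSE_WAIT  = 8,
};

#define MODEL_TCPF_ESTABLISHED (1 << MODEL_TCP_ESTABLISHED)
#define MODEL_TCPF_CLOSE_WAIT  (1 << MODEL_TCP_CLOSE_WAIT)

/* Minimal model of a subflow socket: only the fields this sketch needs. */
struct model_sock {
	int sk_state;
	unsigned int mss;	/* stays 0 until the connection is set up */
};

/* Same state mask as the quoted mptcp_subflow_active() helper. */
static bool model_can_ack(const struct model_sock *ssk)
{
	return ((1 << ssk->sk_state) &
		(MODEL_TCPF_ESTABLISHED | MODEL_TCPF_CLOSE_WAIT)) != 0;
}

/* Returns the number of full-MSS segments in 'bytes', or -1 when the
 * state check rejects the socket and the ACK path is skipped. */
static int model_cleanup_rbuf(const struct model_sock *ssk, unsigned int bytes)
{
	assert(ssk != NULL);

	/* In the kernel, lock_sock(ssk) would already be held here. */
	if (!model_can_ack(ssk))
		return -1;	/* never established: mss may still be 0 */

	return (int)(bytes / ssk->mss);
}
```

With the check in place, a socket still in the closed state never reaches the division, which is the behaviour the comment above asks for after lock_sock().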
> + release_sock(pick);
> }
> }
>
....
>
> + /* be sure to advertise window change */
> + old_space = READ_ONCE(msk->old_wspace);
> + if ((tcp_space(sk) - old_space) >= old_space)
> + mptcp_send_ack(msk, false);
> +
Yes, if we call recvmsg() right after socket(), we will end up calling tcp_cleanup_rbuf(),
while no byte was ever copied/drained.
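
To see why the quoted condition fires on a fresh socket, here is a small userspace model of the check. It assumes msk->old_wspace is still zero at that point, since the only writer visible in the quoted hunks is mptcp_established_options_dss(). That initial value is an assumption, not something the patch states, and model_should_advertise() is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper mirroring the quoted check:
 *	(tcp_space(sk) - old_space) >= old_space
 * 'space' models tcp_space(sk), 'old_space' models msk->old_wspace. */
static bool model_should_advertise(int space, int old_space)
{
	return (space - old_space) >= old_space;
}
```

With old_space == 0 the test reduces to space >= 0, so it holds for any non-negative space and mptcp_send_ack(msk, false) would be reached even though nothing was drained.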