* [MPTCP] Re: [PATCH mptcp-next] mptcp: adjust mptcp receive buffer limit if subflow has larger one
@ 2020-08-14 19:58 Mat Martineau
0 siblings, 0 replies; 2+ messages in thread
From: Mat Martineau @ 2020-08-14 19:58 UTC (permalink / raw)
To: mptcp
[-- Attachment #1: Type: text/plain, Size: 2405 bytes --]
On Thu, 13 Aug 2020, Florian Westphal wrote:
> In addition to tcp autotuning during read, it may also increase the
> receive buffer in tcp_clamp_window().
>
> In this case, mptcp should adjust its receive buffer size as well so
> it can pull all pending skbs at once.
>
> At this time, when TCP grows its receive buffer, it may have more
> skbs ready for processing than what mptcp allows.
>
> In the mptcp case, the receive window is derived from free
> space of the mptcp parent socket instead of the individual subflows.
> Following the subflow allows mptcp to grow its receive buffer.
>
> This is especially noticable for loopback traffic, when even two
> skbs are enough to fill the initial receive window.
>
> Signed-off-by: Florian Westphal <fw(a)strlen.de>
> ---
> net/mptcp/protocol.c | 15 +++++++++++----
> 1 file changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index c800b9147a3c..d9307a3e1e62 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -603,6 +603,7 @@ void mptcp_data_ready(struct sock *sk, struct sock *ssk)
> {
> struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
> struct mptcp_sock *msk = mptcp_sk(sk);
> + int sk_rbuf, ssk_rbuf;
> bool wake;
>
> /* move_skbs_to_msk below can legitly clear the data_avail flag,
> @@ -613,12 +614,18 @@ void mptcp_data_ready(struct sock *sk, struct sock *ssk)
> if (wake)
> set_bit(MPTCP_DATA_READY, &msk->flags);
>
> - if (atomic_read(&sk->sk_rmem_alloc) < READ_ONCE(sk->sk_rcvbuf) &&
> - move_skbs_to_msk(msk, ssk))
> + sk_rbuf = READ_ONCE(sk->sk_rcvbuf);
> + ssk_rbuf = READ_ONCE(ssk->sk_rcvbuf);
> + if (ssk_rbuf > sk_rbuf) {
> + WRITE_ONCE(sk->sk_rcvbuf, ssk_rbuf);
Any concern about this racing with updates to sk->sk_rcvbuf assignment
(with lock held) in mptcp_rcv_space_adjust() or other subflows
concurrently calling mptcp_data_ready?
> + sk_rbuf = ssk_rbuf;
> + }
> +
> + /* over limit? can't append more skbs to msk */
> + if (atomic_read(&sk->sk_rmem_alloc) > sk_rbuf)
> goto wake;
>
> - /* don't schedule if mptcp sk is (still) over limit */
> - if (atomic_read(&sk->sk_rmem_alloc) > READ_ONCE(sk->sk_rcvbuf))
> + if (move_skbs_to_msk(msk, ssk))
> goto wake;
>
> /* mptcp socket is owned, release_cb should retry */
> --
> 2.26.2
--
Mat Martineau
Intel
^ permalink raw reply [flat|nested] 2+ messages in thread
* [MPTCP] Re: [PATCH mptcp-next] mptcp: adjust mptcp receive buffer limit if subflow has larger one
@ 2020-08-14 22:17 Florian Westphal
0 siblings, 0 replies; 2+ messages in thread
From: Florian Westphal @ 2020-08-14 22:17 UTC (permalink / raw)
To: mptcp
[-- Attachment #1: Type: text/plain, Size: 2671 bytes --]
Mat Martineau <mathew.j.martineau(a)linux.intel.com> wrote:
> > In addition to tcp autotuning during read, it may also increase the
> > receive buffer in tcp_clamp_window().
> >
> > In this case, mptcp should adjust its receive buffer size as well so
> > it can pull all pending skbs at once.
> >
> > At this time, when TCP grows its receive buffer, it may have more
> > skbs ready for processing than what mptcp allows.
> >
> > In the mptcp case, the receive window is derived from free
> > space of the mptcp parent socket instead of the individual subflows.
> > Following the subflow allows mptcp to grow its receive buffer.
> >
> > This is especially noticable for loopback traffic, when even two
> > skbs are enough to fill the initial receive window.
> >
> > Signed-off-by: Florian Westphal <fw(a)strlen.de>
> > ---
> > net/mptcp/protocol.c | 15 +++++++++++----
> > 1 file changed, 11 insertions(+), 4 deletions(-)
> >
> > diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> > index c800b9147a3c..d9307a3e1e62 100644
> > --- a/net/mptcp/protocol.c
> > +++ b/net/mptcp/protocol.c
> > @@ -603,6 +603,7 @@ void mptcp_data_ready(struct sock *sk, struct sock *ssk)
> > {
> > struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
> > struct mptcp_sock *msk = mptcp_sk(sk);
> > + int sk_rbuf, ssk_rbuf;
> > bool wake;
> >
> > /* move_skbs_to_msk below can legitly clear the data_avail flag,
> > @@ -613,12 +614,18 @@ void mptcp_data_ready(struct sock *sk, struct sock *ssk)
> > if (wake)
> > set_bit(MPTCP_DATA_READY, &msk->flags);
> >
> > - if (atomic_read(&sk->sk_rmem_alloc) < READ_ONCE(sk->sk_rcvbuf) &&
> > - move_skbs_to_msk(msk, ssk))
> > + sk_rbuf = READ_ONCE(sk->sk_rcvbuf);
> > + ssk_rbuf = READ_ONCE(ssk->sk_rcvbuf);
> > + if (ssk_rbuf > sk_rbuf) {
> > + WRITE_ONCE(sk->sk_rcvbuf, ssk_rbuf);
>
> Any concern about this racing with updates to sk->sk_rcvbuf assignment (with
> lock held) in mptcp_rcv_space_adjust() or other subflows concurrently
> calling mptcp_data_ready?
THe only option I see ATM is to move the adjustment to when we move skbs
from subflow to mptcp socket, since thats the only spot where we hold
both mptcp and subflow socket locks.
> > + sk_rbuf = ssk_rbuf;
> > + }
> > +
> > + /* over limit? can't append more skbs to msk */
> > + if (atomic_read(&sk->sk_rmem_alloc) > sk_rbuf)
> > goto wake;
To avoid this from short-ciruiting early, what about calculating
the diff of ssk rcvbuf vs. mpcp rcvbuf space and use that as a
"cushion"?
I think that this would avoid any racing access without adding
complexity.
I'll give it a try.
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2020-08-14 22:17 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-08-14 22:17 [MPTCP] Re: [PATCH mptcp-next] mptcp: adjust mptcp receive buffer limit if subflow has larger one Florian Westphal
-- strict thread matches above, loose matches on Subject: below --
2020-08-14 19:58 Mat Martineau
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.