netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Eric Dumazet <eric.dumazet@gmail.com>
To: Neal Cardwell <ncardwell@google.com>, Lawrence Brakmo <brakmo@fb.com>
Cc: Yuchung Cheng <ycheng@google.com>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"edumazet@google.com" <edumazet@google.com>,
	"soheil@google.com" <soheil@google.com>
Subject: Re: [PATCH net] bpf: always re-init the congestion control after switching to it
Date: Tue, 23 Jan 2018 11:50:04 -0800	[thread overview]
Message-ID: <1516737004.3715.8.camel@gmail.com> (raw)
In-Reply-To: <CADVnQym8+Z_OXDk7a5tVH7a4HHkXLFQ8AnjJ0qxF6dmC8=tODA@mail.gmail.com>

On Tue, 2018-01-23 at 14:39 -0500, Neal Cardwell wrote:
> On Tue, Jan 23, 2018 at 2:20 PM, Lawrence Brakmo <brakmo@fb.com> wrote:
> > On 1/23/18, 9:30 AM, "Yuchung Cheng" <ycheng@google.com> wrote:
> > 
> >     The original patch that changes TCP's congestion control via eBPF only
> >     re-initializes the new congestion control, if the BPF op is set to an
> >     (invalid) value beyond BPF_SOCK_OPS_NEEDS_ECN. Consequently TCP will
> > 
> > What do you mean by “(invalid) value”?
> > 
> >     run the new congestion control from random states.
> > 
> > This has always been possible with setsockopt, no?
> > 
> >    This patch fixes
> >     the issue by always re-init the congestion control like other means
> >     such as setsockopt and sysctl changes.
> > 
> > The current code re-inits the congestion control when calling
> > tcp_set_congestion_control except when it is called early on (i.e. op <=
> > BPF_SOCK_OPS_NEEDS_ECN). In that case there is no need to re-initialize
> > since it will be initialized later by TCP when the connection is established.
> > 
> > Otherwise, if we always call tcp_reinit_congestion_control we would call
> > tcp_cleanup_congestion_control when the congestion control has not been
> > initialized yet.
> 
> On Tue, Jan 23, 2018 at 2:20 PM, Lawrence Brakmo <brakmo@fb.com> wrote:
> > On 1/23/18, 9:30 AM, "Yuchung Cheng" <ycheng@google.com> wrote:
> > 
> >     The original patch that changes TCP's congestion control via eBPF only
> >     re-initializes the new congestion control, if the BPF op is set to an
> >     (invalid) value beyond BPF_SOCK_OPS_NEEDS_ECN. Consequently TCP will
> > 
> > What do you mean by “(invalid) value”?
> > 
> >     run the new congestion control from random states.
> > 
> > This has always been possible with setsockopt, no?
> > 
> >    This patch fixes
> >     the issue by always re-init the congestion control like other means
> >     such as setsockopt and sysctl changes.
> > 
> > The current code re-inits the congestion control when calling
> > tcp_set_congestion_control except when it is called early on (i.e. op <=
> > BPF_SOCK_OPS_NEEDS_ECN). In that case there is no need to re-initialize
> > since it will be initialized later by TCP when the connection is established.
> > 
> > Otherwise, if we always call tcp_reinit_congestion_control we would call
> > tcp_cleanup_congestion_control when the congestion control has not been
> > initialized yet.
> 
> Interesting. So I wonder if the symptoms we were seeing were due to
> kernels that did not yet have this fix:
> 
>   27204aaa9dc6 ("tcp: uniform the set up of sockets after successful
> connection):
>   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=27204aaa9dc67b833b77179fdac556288ec3a4bf
> 
> Before that fix, there could be TFO passive connections that at SYN time called:
>   tcp_init_congestion_control(child);
> and then:
>   tcp_call_bpf(child, BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB);
> 
> So that if the CC was switched in the
> BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB handler then there would be no
> init for the new module?


Note that bpf_sock->op can be written by a malicious BPF filter.

So, a malicious filter can switch from Cubic to BBR without re-init,
and bad things can happen.

I do not believe we should trust BPF here.

  reply	other threads:[~2018-01-23 19:50 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-23 17:30 [PATCH net] bpf: always re-init the congestion control after switching to it Yuchung Cheng
2018-01-23 19:20 ` Lawrence Brakmo
2018-01-23 19:39   ` Neal Cardwell
2018-01-23 19:50     ` Eric Dumazet [this message]
2018-01-23 20:19       ` Lawrence Brakmo
2018-01-23 21:39         ` Yuchung Cheng
2018-01-23 23:25         ` Alexei Starovoitov
2018-01-23 23:30           ` Lawrence Brakmo
2018-01-23 23:36             ` Yuchung Cheng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1516737004.3715.8.camel@gmail.com \
    --to=eric.dumazet@gmail.com \
    --cc=brakmo@fb.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=ncardwell@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=soheil@google.com \
    --cc=ycheng@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).