All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eric Dumazet <eric.dumazet@gmail.com>
To: Neal Cardwell <ncardwell@google.com>, Lawrence Brakmo <brakmo@fb.com>
Cc: Yuchung Cheng <ycheng@google.com>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"edumazet@google.com" <edumazet@google.com>,
	"soheil@google.com" <soheil@google.com>
Subject: Re: [PATCH net] bpf: always re-init the congestion control after switching to it
Date: Tue, 23 Jan 2018 11:50:04 -0800	[thread overview]
Message-ID: <1516737004.3715.8.camel@gmail.com> (raw)
In-Reply-To: <CADVnQym8+Z_OXDk7a5tVH7a4HHkXLFQ8AnjJ0qxF6dmC8=tODA@mail.gmail.com>

On Tue, 2018-01-23 at 14:39 -0500, Neal Cardwell wrote:
> On Tue, Jan 23, 2018 at 2:20 PM, Lawrence Brakmo <brakmo@fb.com> wrote:
> > On 1/23/18, 9:30 AM, "Yuchung Cheng" <ycheng@google.com> wrote:
> > 
> >     The original patch that changes TCP's congestion control via eBPF only
> >     re-initializes the new congestion control, if the BPF op is set to an
> >     (invalid) value beyond BPF_SOCK_OPS_NEEDS_ECN. Consequently TCP will
> > 
> > What do you mean by “(invalid) value”?
> > 
> >     run the new congestion control from random states.
> > 
> > This has always been possible with setsockopt, no?
> > 
> >    This patch fixes
> >     the issue by always re-init the congestion control like other means
> >     such as setsockopt and sysctl changes.
> > 
> > The current code re-inits the congestion control when calling
> > tcp_set_congestion_control except when it is called early on (i.e. op <=
> > BPF_SOCK_OPS_NEEDS_ECN). In that case there is no need to re-initialize
> > since it will be initialized later by TCP when the connection is established.
> > 
> > Otherwise, if we always call tcp_reinit_congestion_control we would call
> > tcp_cleanup_congestion_control when the congestion control has not been
> > initialized yet.
> 
> On Tue, Jan 23, 2018 at 2:20 PM, Lawrence Brakmo <brakmo@fb.com> wrote:
> > On 1/23/18, 9:30 AM, "Yuchung Cheng" <ycheng@google.com> wrote:
> > 
> >     The original patch that changes TCP's congestion control via eBPF only
> >     re-initializes the new congestion control, if the BPF op is set to an
> >     (invalid) value beyond BPF_SOCK_OPS_NEEDS_ECN. Consequently TCP will
> > 
> > What do you mean by “(invalid) value”?
> > 
> >     run the new congestion control from random states.
> > 
> > This has always been possible with setsockopt, no?
> > 
> >    This patch fixes
> >     the issue by always re-init the congestion control like other means
> >     such as setsockopt and sysctl changes.
> > 
> > The current code re-inits the congestion control when calling
> > tcp_set_congestion_control except when it is called early on (i.e. op <=
> > BPF_SOCK_OPS_NEEDS_ECN). In that case there is no need to re-initialize
> > since it will be initialized later by TCP when the connection is established.
> > 
> > Otherwise, if we always call tcp_reinit_congestion_control we would call
> > tcp_cleanup_congestion_control when the congestion control has not been
> > initialized yet.
> 
> Interesting. So I wonder if the symptoms we were seeing were due to
> kernels that did not yet have this fix:
> 
>   27204aaa9dc6 ("tcp: uniform the set up of sockets after successful
> connection):
>   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=27204aaa9dc67b833b77179fdac556288ec3a4bf
> 
> Before that fix, there could be TFO passive connections that at SYN time called:
>   tcp_init_congestion_control(child);
> and then:
>   tcp_call_bpf(child, BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB);
> 
> So that if the CC was switched in the
> BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB handler then there would be no
> init for the new module?


Note that bpf_sock->op can be written by a malicious BPF filter.

So, a malicious filter can switch from Cubic to BBR without re-init,
and bad things can happen.

I do not believe we should trust BPF here.

  reply	other threads:[~2018-01-23 19:50 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-23 17:30 [PATCH net] bpf: always re-init the congestion control after switching to it Yuchung Cheng
2018-01-23 19:20 ` Lawrence Brakmo
2018-01-23 19:39   ` Neal Cardwell
2018-01-23 19:50     ` Eric Dumazet [this message]
2018-01-23 20:19       ` Lawrence Brakmo
2018-01-23 21:39         ` Yuchung Cheng
2018-01-23 23:25         ` Alexei Starovoitov
2018-01-23 23:30           ` Lawrence Brakmo
2018-01-23 23:36             ` Yuchung Cheng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1516737004.3715.8.camel@gmail.com \
    --to=eric.dumazet@gmail.com \
    --cc=brakmo@fb.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=ncardwell@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=soheil@google.com \
    --cc=ycheng@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.