Re: [PATCH bpf-next 1/2] bpf/flow_dissector: add mode to enforce global BPF flow dissector

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Stanislav Fomichev <sdf@fomichev.me>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Stanislav Fomichev <sdf@google.com>,
	Networking <netdev@vger.kernel.org>, bpf <bpf@vger.kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Petar Penkov <ppenkov@google.com>
Subject: Re: [PATCH bpf-next 1/2] bpf/flow_dissector: add mode to enforce global BPF flow dissector
Date: Wed, 2 Oct 2019 18:43:56 -0700	[thread overview]
Message-ID: <20191003014356.GC3223377@mini-arch> (raw)
In-Reply-To: <CAEf4BzZuEChOL828F91wLxUr3h2yfAkZvhsyoSx18uSFSxOtqw@mail.gmail.com>

On 10/02, Andrii Nakryiko wrote:
> On Wed, Oct 2, 2019 at 10:35 AM Stanislav Fomichev <sdf@google.com> wrote:
> >
> > Always use init_net flow dissector BPF program if it's attached and fall
> > back to the per-net namespace one. Also, deny installing new programs if
> > there is already one attached to the root namespace.
> > Users can still detach their BPF programs, but can't attach any
> > new ones (-EPERM).
> >
> > Cc: Petar Penkov <ppenkov@google.com>
> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > ---
> >  Documentation/bpf/prog_flow_dissector.rst |  3 +++
> >  net/core/flow_dissector.c                 | 11 ++++++++++-
> >  2 files changed, 13 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/bpf/prog_flow_dissector.rst b/Documentation/bpf/prog_flow_dissector.rst
> > index a78bf036cadd..4d86780ab0f1 100644
> > --- a/Documentation/bpf/prog_flow_dissector.rst
> > +++ b/Documentation/bpf/prog_flow_dissector.rst
> > @@ -142,3 +142,6 @@ BPF flow dissector doesn't support exporting all the metadata that in-kernel
> >  C-based implementation can export. Notable example is single VLAN (802.1Q)
> >  and double VLAN (802.1AD) tags. Please refer to the ``struct bpf_flow_keys``
> >  for a set of information that's currently can be exported from the BPF context.
> > +
> > +When BPF flow dissector is attached to the root network namespace (machine-wide
> > +policy), users can't override it in their child network namespaces.
> > diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
> > index 7c09d87d3269..494e2016fe84 100644
> > --- a/net/core/flow_dissector.c
> > +++ b/net/core/flow_dissector.c
> > @@ -115,6 +115,11 @@ int skb_flow_dissector_bpf_prog_attach(const union bpf_attr *attr,
> >         struct bpf_prog *attached;
> >         struct net *net;
> >
> > +       if (rcu_access_pointer(init_net.flow_dissector_prog)) {
> > +               /* Can't override root flow dissector program */
> > +               return -EPERM;
> > +       }
> 
> This is racy, shouldn't this be checked after grabbing a lock below?
What kind of race do you have in mind?

Even if I put this check under the mutex, it's still possible that if
two cpus concurrently start attaching flow dissector programs (i.e. call
sys_bpf(BPF_PROG_ATTACH)) at the same time (one to root ns, the other
to non-root ns), the cpu that is attaching to non-root can grab mutex first,
pass all the checks and attach the prog (higher frequency, tubo boost, etc).

The mutex is there to protect only against concurrent attaches to the
_same_ netns. For the sake of simplicity we have a global one instead
of a mutex per net-ns.

So I'd rather not grab the mutex and keep it simple. Even in there is a
race, in __skb_flow_dissect we always check init_net first.

> > +
> >         net = current->nsproxy->net_ns;
> >         mutex_lock(&flow_dissector_mutex);
> >         attached = rcu_dereference_protected(net->flow_dissector_prog,
> > @@ -910,7 +915,11 @@ bool __skb_flow_dissect(const struct net *net,
> >         WARN_ON_ONCE(!net);
> >         if (net) {
> >                 rcu_read_lock();
> > -               attached = rcu_dereference(net->flow_dissector_prog);
> > +               attached =
> > +                       rcu_dereference(init_net.flow_dissector_prog);
> > +
> > +               if (!attached)
> > +                       attached = rcu_dereference(net->flow_dissector_prog);
> >
> >                 if (attached) {
> >                         struct bpf_flow_keys flow_keys;
> > --
> > 2.23.0.444.g18eeb5a265-goog
> >

next prev parent reply	other threads:[~2019-10-03  1:44 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-02 17:33 [PATCH bpf-next 0/2] bpf/flow_dissector: add mode to enforce global BPF flow dissector Stanislav Fomichev
2019-10-02 17:33 ` [PATCH bpf-next 1/2] " Stanislav Fomichev
2019-10-02 20:57   ` Song Liu
2019-10-02 21:31     ` Stanislav Fomichev
2019-10-02 23:29   ` Andrii Nakryiko
2019-10-03  1:43     ` Stanislav Fomichev [this message]
2019-10-03  2:47       ` Andrii Nakryiko
2019-10-03 16:01         ` Stanislav Fomichev
2019-10-03 16:26           ` Andrii Nakryiko
2019-10-03 17:45             ` John Fastabend
2019-10-03 17:58               ` Stanislav Fomichev
2019-10-02 17:33 ` [PATCH bpf-next 2/2] selftests/bpf: add test for BPF flow dissector in the root namespace Stanislav Fomichev

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191003014356.GC3223377@mini-arch \
    --to=sdf@fomichev.me \
    --cc=andrii.nakryiko@gmail.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=netdev@vger.kernel.org \
    --cc=ppenkov@google.com \
    --cc=sdf@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.