All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jakub Sitnicki <jakub@cloudflare.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: bpf <bpf@vger.kernel.org>, Networking <netdev@vger.kernel.org>,
	kernel-team@cloudflare.com, Andrii Nakryiko <andriin@fb.com>
Subject: Re: [PATCH bpf-next v2 2/3] bpf, netns: Keep attached programs in bpf_prog_array
Date: Wed, 24 Jun 2020 20:37:04 +0200	[thread overview]
Message-ID: <87k0zwmhtb.fsf@cloudflare.com> (raw)
In-Reply-To: <CAEf4BzZYjPL+TmqFfuEFgzm+qUh_T2zHsRZjq-BE+LMu+25=ZQ@mail.gmail.com>

On Wed, Jun 24, 2020 at 08:24 PM CEST, Andrii Nakryiko wrote:
> On Wed, Jun 24, 2020 at 11:16 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>>
>> On Wed, Jun 24, 2020 at 07:47 PM CEST, Andrii Nakryiko wrote:
>> > On Wed, Jun 24, 2020 at 10:19 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>> >>
>> >> On Tue, Jun 23, 2020 at 12:34 PM CEST, Jakub Sitnicki wrote:
>> >> > Prepare for having multi-prog attachments for new netns attach types by
>> >> > storing programs to run in a bpf_prog_array, which is well suited for
>> >> > iterating over programs and running them in sequence.
>> >> >
>> >> > Because bpf_prog_array is dynamically resized, after this change a
>> >> > potentially blocking memory allocation in bpf(PROG_QUERY) callback can
>> >> > happen, in order to collect program IDs before copying the values to
>> >> > user-space supplied buffer. This forces us to adapt how we protect access
>> >> > to the attached program in the callback. As bpf_prog_array_copy_to_user()
>> >> > helper can sleep, we switch from an RCU read lock to holding a mutex that
>> >> > serializes updaters.
>> >> >
>> >> > To handle bpf(PROG_ATTACH) scenario when we are replacing an already
>> >> > attached program, we introduce a new bpf_prog_array helper called
>> >> > bpf_prog_array_replace_item that will exchange the old program with a new
>> >> > one. bpf-cgroup does away with such helper by computing an index into the
>> >> > array from a program position in an external list of attached
>> >> > programs/links. Such approach fails when a dummy prog is left in the array
>> >> > after a memory allocation failure on link release, but is necessary in
>> >> > bpf-cgroup case because the same BPF program can be present in the array
>> >> > multiple times due to inheritance, and attachment cannot be reliably
>> >> > identified by bpf_prog pointer comparison.
>> >> >
>> >> > No functional changes intended.
>> >> >
>> >> > Acked-by: Andrii Nakryiko <andriin@fb.com>
>> >> > Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
>> >> > ---
>> >> >  include/linux/bpf.h        |   3 +
>> >> >  include/net/netns/bpf.h    |   5 +-
>> >> >  kernel/bpf/core.c          |  20 ++++--
>> >> >  kernel/bpf/net_namespace.c | 137 +++++++++++++++++++++++++++----------
>> >> >  net/core/flow_dissector.c  |  21 +++---
>> >> >  5 files changed, 132 insertions(+), 54 deletions(-)
>> >> >
>> >>
>> >> [...]
>> >>
>> >> > diff --git a/kernel/bpf/net_namespace.c b/kernel/bpf/net_namespace.c
>> >> > index b951dab2687f..593523a22168 100644
>> >> > --- a/kernel/bpf/net_namespace.c
>> >> > +++ b/kernel/bpf/net_namespace.c
>> >>
>> >> [...]
>> >>
>> >> > @@ -93,8 +108,16 @@ static int bpf_netns_link_update_prog(struct bpf_link *link,
>> >> >               goto out_unlock;
>> >> >       }
>> >> >
>> >> > +     run_array = rcu_dereference_protected(net->bpf.run_array[type],
>> >> > +                                           lockdep_is_held(&netns_bpf_mutex));
>> >> > +     if (run_array)
>> >> > +             ret = bpf_prog_array_replace_item(run_array, link->prog, new_prog);
>> >>
>> >> Thinking about this some more, link update should fail with -EINVAL if
>> >> new_prog already exists in run_array. Same as PROG_ATTACH fails with
>> >> -EINVAL when trying to attach the same prog for the second time.
>> >>
>> >> Otherwise, LINK_UPDATE can lead to having same BPF prog present multiple
>> >> times in the prog_array, once attaching more than one prog gets enabled.
>> >>
>> >> Then we would we end up with the same challenge as bpf-cgroup, that is
>> >> how to find the program index into the prog_array in presence of
>> >> dummy_prog's.
>> >
>> > If you attach 5 different links having the same bpf_prog, it should be
>> > allowed and all five bpf_progs should be attached and called 5 times.
>> > They are independent links, that's the main thing. What specific BPF
>> > program is attached by the link (or later updated to) shouldn't be of
>> > any concern here (relative to other attached links/programs).
>> >
>> > Attaching the same *link* twice shouldn't be allowed, though.
>>
>> Thanks for clarifying. I need to change the approach then:
>>
>>  1) find the prog index into prog_array by iterating the list of links,
>>  2) adjust the index for any dummy progs in prog_array below the index.
>>
>> That might work for bpf-cgroup too.
>>
>
> I thought that's what bpf-cgroup already does... Except right now
> there could be no dummy progs, but if we do non-failing detachment,
> there might be. Then hierarchies can get out of sync and we need to
> handle that nicely. It's not super straightforward, that's why I said
> that it's a nice challenge to consider :)

Now I get it... bpf-cgroup doesn't use bpf_prog_array_delete_safe(). I
was confusing it with what I saw in bpf_trace all along.

> But here we don't have hierarchy, it's a happy place to be in :)

It feels like there are some war stories to tell about bpf-cgroup.

>> The only other alternative I can think of it to copy the prog array to
>> filter out dummy_progs, before replacing the prog on link update.
>
> Probably over-complication. I'd filter dummy progs only on new link
> attachment or detachment. Update can be kept very simple.

OK, I know what I need to do.

[...]

  reply	other threads:[~2020-06-24 18:37 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-23 10:34 [PATCH bpf-next v2 0/3] bpf, netns: Prepare for multi-prog attachment Jakub Sitnicki
2020-06-23 10:34 ` [PATCH bpf-next v2 1/3] flow_dissector: Pull BPF program assignment up to bpf-netns Jakub Sitnicki
2020-06-23 10:34 ` [PATCH bpf-next v2 2/3] bpf, netns: Keep attached programs in bpf_prog_array Jakub Sitnicki
2020-06-23 19:33   ` Martin KaFai Lau
2020-06-23 20:59     ` Jakub Sitnicki
2020-06-23 21:24       ` Martin KaFai Lau
2020-06-24 17:33         ` Jakub Sitnicki
2020-06-24 17:18   ` Jakub Sitnicki
2020-06-24 17:47     ` Andrii Nakryiko
2020-06-24 18:13       ` Jakub Sitnicki
2020-06-24 18:24         ` Andrii Nakryiko
2020-06-24 18:37           ` Jakub Sitnicki [this message]
2020-06-23 10:34 ` [PATCH bpf-next v2 3/3] bpf, netns: Keep a list of attached bpf_link's Jakub Sitnicki
2020-06-29 14:56 ` [PATCH bpf-next v2 0/3] bpf, netns: Prepare for multi-prog attachment Daniel Borkmann
2020-06-29 14:57   ` Daniel Borkmann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87k0zwmhtb.fsf@cloudflare.com \
    --to=jakub@cloudflare.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=andriin@fb.com \
    --cc=bpf@vger.kernel.org \
    --cc=kernel-team@cloudflare.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.