From: Jakub Sitnicki <jakub@cloudflare.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: bpf <bpf@vger.kernel.org>, Networking <netdev@vger.kernel.org>,
kernel-team@cloudflare.com
Subject: Re: [PATCH bpf-next 2/3] bpf, netns: Keep attached programs in bpf_prog_array
Date: Tue, 23 Jun 2020 12:51:26 +0200
Message-ID: <87tuz2m4wh.fsf@cloudflare.com>
In-Reply-To: <CAEf4BzYY8NcmprF-V3SxBgiF0mqNpK-qrymt=wvz6iCON=geiw@mail.gmail.com>

On Tue, Jun 23, 2020 at 08:23 AM CEST, Andrii Nakryiko wrote:
> On Mon, Jun 22, 2020 at 9:04 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>>
>> Prepare for having multi-prog attachments for new netns attach types by
>> storing programs to run in a bpf_prog_array, which is well suited for
>> iterating over programs and running them in sequence.
>>
>> Because bpf_prog_array is dynamically resized, after this change a
>> potentially blocking memory allocation can happen in the bpf(PROG_QUERY)
>> callback, in order to collect program IDs before copying the values to
>> the user-space supplied buffer. This forces us to adapt how we protect
>> access to the attached program in the callback. As the
>> bpf_prog_array_copy_to_user() helper can sleep, we switch from an RCU
>> read lock to holding a mutex that serializes updaters.
>>
>> To handle the bpf(PROG_ATTACH) scenario where we are replacing an already
>> attached program, we introduce a new bpf_prog_array helper called
>> bpf_prog_array_replace_item that will exchange the old program with a new
>> one. bpf-cgroup gets by without such a helper by computing an index into
>> the array based on the program's position in an external list of attached
>> programs/links. Such an approach seems fragile, however, when dummy progs
>> can be left in the array after a memory allocation failure on link release.
>
> bpf-cgroup can have the same BPF program present multiple times in the
> effective prog array due to inheritance. It also has a strict
> guarantee/requirement about the relative order of programs in parent
> cgroups vs child cgroups. For such cases, replacing a BPF program based
> on its pointer is not going to work correctly.

Thanks for the explanation. That had not occurred to me. I have
incorporated it into the description in v2.

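To make the point above concrete, here is a minimal user-space sketch,
not the kernel code. It assumes a NULL-terminated array of program
pointers; struct prog, prog_array_replace_item() and prog_array_count()
are hypothetical stand-ins written for this illustration. Replacing by
pointer is unambiguous while each program appears at most once, as in the
netns case, but it touches only the first match when the same program
occupies several slots, as can happen in a cgroup effective array.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a BPF program; not the kernel's struct bpf_prog. */
struct prog {
	int id;
};

/*
 * Swap the first slot that holds old_prog for new_prog.
 * items is a NULL-terminated array of program pointers.
 * Returns 0 on success, -ENOENT if old_prog is not in the array.
 */
static int prog_array_replace_item(struct prog **items,
				   struct prog *old_prog,
				   struct prog *new_prog)
{
	assert(items && old_prog && new_prog);

	for (size_t i = 0; items[i]; i++) {
		if (items[i] == old_prog) {
			items[i] = new_prog;
			return 0;
		}
	}
	return -ENOENT;
}

/* Count how many slots point at prog. */
static size_t prog_array_count(struct prog **items, const struct prog *prog)
{
	size_t n = 0;

	for (size_t i = 0; items[i]; i++)
		if (items[i] == prog)
			n++;
	return n;
}
```

With an array such as { &a, &b, &a, NULL }, replacing &a by pointer
rewrites slot 0 only and leaves slot 2 pointing at the old program, which
is why an index derived from the attachment list is needed there.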
>
> We do need to make sure that cgroup detachment never fails by falling
> back to replacing the BPF prog with a dummy prog, though. If you are
> interested in a challenge, you are very welcome to do that! :)

I keep a list of tasks for a slow day.
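
For readers unfamiliar with the dummy-prog fallback mentioned above, the
following is a rough user-space sketch of the idea, under stated
assumptions. All names (struct prog, dummy_prog, try_alloc(),
fail_next_alloc, prog_array_detach()) are hypothetical and exist only for
this illustration. Detach first tries to build a smaller array; if that
allocation fails, it overwrites the slot in place with a no-op placeholder,
so detach itself cannot fail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a BPF program. */
struct prog {
	int id;
};

/* Placeholder that a runner would skip; stands in for a no-op program. */
static struct prog dummy_prog = { .id = 0 };

/* Test hook (an assumption of this sketch): force the next allocation to fail. */
static bool fail_next_alloc;

static void *try_alloc(size_t n, size_t size)
{
	if (fail_next_alloc) {
		fail_next_alloc = false;
		return NULL;
	}
	return calloc(n, size);
}

/* Number of entries before the NULL terminator, placeholders included. */
static size_t prog_array_len(struct prog **items)
{
	size_t n = 0;

	while (items[n])
		n++;
	return n;
}

/* Number of entries that are real programs rather than placeholders. */
static size_t prog_array_active(struct prog **items)
{
	size_t n = 0;

	for (size_t i = 0; items[i]; i++)
		if (items[i] != &dummy_prog)
			n++;
	return n;
}

/*
 * Detach prog from the heap-allocated, NULL-terminated array *itemsp.
 * Preferred path: copy the survivors into a fresh array and free the old one.
 * Fallback path: on allocation failure, overwrite matching slots in place
 * with the placeholder. Either way the function does not report failure.
 */
static void prog_array_detach(struct prog ***itemsp, struct prog *prog)
{
	struct prog **old = *itemsp;
	size_t len = prog_array_len(old);
	struct prog **new_items = try_alloc(len + 1, sizeof(*new_items));
	size_t j = 0;

	if (!new_items) {
		for (size_t i = 0; i < len; i++)
			if (old[i] == prog)
				old[i] = &dummy_prog;
		return;
	}

	for (size_t i = 0; i < len; i++)
		if (old[i] != prog)
			new_items[j++] = old[i];
	new_items[j] = NULL;

	free(old);
	*itemsp = new_items;
}
```

The cost of the fallback is that placeholders linger in the array, which
is what makes index computations based on an external list fragile.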
[...]