From: Jakub Sitnicki <jakub@cloudflare.com>
To: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org, kernel-team@cloudflare.com,
Andrii Nakryiko <andriin@fb.com>
Subject: Re: [PATCH bpf-next v2 2/3] bpf, netns: Keep attached programs in bpf_prog_array
Date: Wed, 24 Jun 2020 19:18:42 +0200 [thread overview]
Message-ID: <87o8p8mlfx.fsf@cloudflare.com> (raw)
In-Reply-To: <20200623103459.697774-3-jakub@cloudflare.com>
On Tue, Jun 23, 2020 at 12:34 PM CEST, Jakub Sitnicki wrote:
> Prepare for having multi-prog attachments for new netns attach types by
> storing programs to run in a bpf_prog_array, which is well suited for
> iterating over programs and running them in sequence.
>
> Because bpf_prog_array is dynamically resized, after this change a
> potentially blocking memory allocation in bpf(PROG_QUERY) callback can
> happen, in order to collect program IDs before copying the values to
> user-space supplied buffer. This forces us to adapt how we protect access
> to the attached program in the callback. As bpf_prog_array_copy_to_user()
> helper can sleep, we switch from an RCU read lock to holding a mutex that
> serializes updaters.
>
> To handle bpf(PROG_ATTACH) scenario when we are replacing an already
> attached program, we introduce a new bpf_prog_array helper called
> bpf_prog_array_replace_item that will exchange the old program with a new
> one. bpf-cgroup does away with such helper by computing an index into the
> array from a program position in an external list of attached
> programs/links. Such approach fails when a dummy prog is left in the array
> after a memory allocation failure on link release, but is necessary in
> bpf-cgroup case because the same BPF program can be present in the array
> multiple times due to inheritance, and attachment cannot be reliably
> identified by bpf_prog pointer comparison.
>
> No functional changes intended.
>
> Acked-by: Andrii Nakryiko <andriin@fb.com>
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---
> include/linux/bpf.h | 3 +
> include/net/netns/bpf.h | 5 +-
> kernel/bpf/core.c | 20 ++++--
> kernel/bpf/net_namespace.c | 137 +++++++++++++++++++++++++++----------
> net/core/flow_dissector.c | 21 +++---
> 5 files changed, 132 insertions(+), 54 deletions(-)
>
[...]
> diff --git a/kernel/bpf/net_namespace.c b/kernel/bpf/net_namespace.c
> index b951dab2687f..593523a22168 100644
> --- a/kernel/bpf/net_namespace.c
> +++ b/kernel/bpf/net_namespace.c
[...]
> @@ -93,8 +108,16 @@ static int bpf_netns_link_update_prog(struct bpf_link *link,
> goto out_unlock;
> }
>
> + run_array = rcu_dereference_protected(net->bpf.run_array[type],
> + lockdep_is_held(&netns_bpf_mutex));
> + if (run_array)
> + ret = bpf_prog_array_replace_item(run_array, link->prog, new_prog);
Thinking about this some more, link update should fail with -EINVAL if
new_prog already exists in run_array. Same as PROG_ATTACH fails with
-EINVAL when trying to attach the same prog for the second time.
Otherwise, LINK_UPDATE can lead to having same BPF prog present multiple
times in the prog_array, once attaching more than one prog gets enabled.
Then we would we end up with the same challenge as bpf-cgroup, that is
how to find the program index into the prog_array in presence of
dummy_prog's.
> + else
> + ret = -ENOENT;
> + if (ret)
> + goto out_unlock;
> +
> old_prog = xchg(&link->prog, new_prog);
> - rcu_assign_pointer(net->bpf.progs[type], new_prog);
> bpf_prog_put(old_prog);
>
> out_unlock:
[...]
> @@ -217,14 +249,25 @@ int netns_bpf_prog_attach(const union bpf_attr *attr, struct bpf_prog *prog)
> if (ret)
> goto out_unlock;
>
> - attached = rcu_dereference_protected(net->bpf.progs[type],
> - lockdep_is_held(&netns_bpf_mutex));
> + attached = net->bpf.progs[type];
> if (attached == prog) {
> /* The same program cannot be attached twice */
> ret = -EINVAL;
> goto out_unlock;
> }
> - rcu_assign_pointer(net->bpf.progs[type], prog);
> +
> + run_array = rcu_dereference_protected(net->bpf.run_array[type],
> + lockdep_is_held(&netns_bpf_mutex));
> + if (run_array) {
> + ret = bpf_prog_array_replace_item(run_array, attached, prog);
I didn't consider here that there can be a run_array with a dummy_prog
from a link release that failed to allocate memory.
In such case bpf_prog_array_replace_item will fail, while we actually
want to replace the dummy_prog.
The right thing to do is to replace the first item in prog array:
if (run_array) {
WRITE_ONCE(run_array->items[0].prog, prog);
} else {
/* allocate a bpf_prog_array */
}
This leaves just one user of bpf_prog_array_replace_item(), so I think
I'm just going to fold it into its only caller, that is the update_prog
callback.
> + } else {
> + ret = bpf_prog_array_copy(NULL, NULL, prog, &run_array);
> + rcu_assign_pointer(net->bpf.run_array[type], run_array);
> + }
> + if (ret)
> + goto out_unlock;
> +
> + net->bpf.progs[type] = prog;
> if (attached)
> bpf_prog_put(attached);
>
[...]
next prev parent reply other threads:[~2020-06-24 17:18 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-06-23 10:34 [PATCH bpf-next v2 0/3] bpf, netns: Prepare for multi-prog attachment Jakub Sitnicki
2020-06-23 10:34 ` [PATCH bpf-next v2 1/3] flow_dissector: Pull BPF program assignment up to bpf-netns Jakub Sitnicki
2020-06-23 10:34 ` [PATCH bpf-next v2 2/3] bpf, netns: Keep attached programs in bpf_prog_array Jakub Sitnicki
2020-06-23 19:33 ` Martin KaFai Lau
2020-06-23 20:59 ` Jakub Sitnicki
2020-06-23 21:24 ` Martin KaFai Lau
2020-06-24 17:33 ` Jakub Sitnicki
2020-06-24 17:18 ` Jakub Sitnicki [this message]
2020-06-24 17:47 ` Andrii Nakryiko
2020-06-24 18:13 ` Jakub Sitnicki
2020-06-24 18:24 ` Andrii Nakryiko
2020-06-24 18:37 ` Jakub Sitnicki
2020-06-23 10:34 ` [PATCH bpf-next v2 3/3] bpf, netns: Keep a list of attached bpf_link's Jakub Sitnicki
2020-06-29 14:56 ` [PATCH bpf-next v2 0/3] bpf, netns: Prepare for multi-prog attachment Daniel Borkmann
2020-06-29 14:57 ` Daniel Borkmann
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87o8p8mlfx.fsf@cloudflare.com \
--to=jakub@cloudflare.com \
--cc=andriin@fb.com \
--cc=bpf@vger.kernel.org \
--cc=kernel-team@cloudflare.com \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.