From mboxrd@z Thu Jan 1 00:00:00 1970 From: Roman Gushchin Subject: Re: [PATCH v3 bpf-next 02/14] bpf: introduce cgroup storage maps Date: Fri, 27 Jul 2018 10:12:43 -0700 Message-ID: <20180727171239.GA9128@castle> References: <20180720174558.5829-1-guro@fb.com> <20180720174558.5829-3-guro@fb.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Cc: , , , Alexei Starovoitov To: Daniel Borkmann Return-path: Content-Disposition: inline In-Reply-To: Sender: linux-kernel-owner@vger.kernel.org List-Id: netdev.vger.kernel.org On Fri, Jul 27, 2018 at 06:11:31AM +0200, Daniel Borkmann wrote: > On 07/20/2018 07:45 PM, Roman Gushchin wrote: > > This commit introduces BPF_MAP_TYPE_CGROUP_STORAGE maps: > > a special type of maps which are implementing the cgroup storage. > > > > From the userspace point of view it's almost a generic > > hash map with the (cgroup inode id, attachment type) pair > > used as a key. > > > > The only difference is that some operations are restricted: > > 1) a user can't create new entries, > > 2) a user can't remove existing entries. > > > > The lookup from userspace is o(log(n)). > > > > Signed-off-by: Roman Gushchin > > Cc: Alexei Starovoitov > > Cc: Daniel Borkmann > > Acked-by: Martin KaFai Lau > > (First of all sorry for the late review, only limited availability this week > on my side.) Np, thank you for the review! > > > --- > > include/linux/bpf-cgroup.h | 38 +++++ > > include/linux/bpf.h | 1 + > > include/linux/bpf_types.h | 3 + > > include/uapi/linux/bpf.h | 6 + > > kernel/bpf/Makefile | 1 + > > kernel/bpf/local_storage.c | 367 +++++++++++++++++++++++++++++++++++++++++++++ > > kernel/bpf/verifier.c | 12 ++ > > 7 files changed, 428 insertions(+) > > create mode 100644 kernel/bpf/local_storage.c > > > > diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h > > index 79795c5fa7c3..6b0e7bd4b154 100644 > > --- a/include/linux/bpf-cgroup.h > > +++ b/include/linux/bpf-cgroup.h > > @@ -3,19 +3,39 @@ > > #define _BPF_CGROUP_H > > > > #include > > +#include > > #include > > > [...] > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > > index 15d69b278277..0b089ba4595d 100644 > > --- a/kernel/bpf/verifier.c > > +++ b/kernel/bpf/verifier.c > > @@ -5140,6 +5140,14 @@ static int replace_map_fd_with_map_ptr(struct bpf_verifier_env *env) > > return -E2BIG; > > } > > > > + if (map->map_type == BPF_MAP_TYPE_CGROUP_STORAGE && > > + bpf_cgroup_storage_assign(env->prog, map)) { > > + verbose(env, > > + "only one cgroup storage is allowed\n"); > > + fdput(f); > > + return -EBUSY; > > + } > > + > > /* hold the map. If the program is rejected by verifier, > > * the map will be released by release_maps() or it > > * will be used by the valid program until it's unloaded > > @@ -5148,6 +5156,10 @@ static int replace_map_fd_with_map_ptr(struct bpf_verifier_env *env) > > map = bpf_map_inc(map, false); > > if (IS_ERR(map)) { > > fdput(f); > > + if (map->map_type == > > + BPF_MAP_TYPE_CGROUP_STORAGE) > > + bpf_cgroup_storage_release(env->prog, > > + map); > > I think this behavior is a bit strange, meaning that we would reset the map via > bpf_cgroup_storage_release() in this case, but if we error out and exit in any > later instruction the prior bpf_cgroup_storage_assign() is not undone, meaning > at this point we have no other choice but to destroy the map since any later > BPF prog load with bpf_cgroup_storage_assign() attempt would fail with -EBUSY > even though it's not assigned anywhere, is that correct? Same also on any other > errors along the prog load path. E.g. say, as one example, your verifier buffer > is too small, so any retry with the very same program from loader side would fail > above due to different prog pointers? Yeah, I see... We should call bpf_cgroup_storage_release() from the generic verifier error path. I'll fix this and the leak in the other patch and resend soon. Thanks!