Re: [PATCH bpf-next v6 04/10] bpf: minimize number of allocated lsm slots per program

From: sdf@google.com
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org, ast@kernel.org,
	daniel@iogearbox.net, andrii@kernel.org
Subject: Re: [PATCH bpf-next v6 04/10] bpf: minimize number of allocated lsm slots per program
Date: Tue, 10 May 2022 10:31:05 -0700
Message-ID: <YnqhWTshFLqMY9kl@google.com>
In-Reply-To: <20220510050546.tpuslkld4rlrqexp@MBP-98dd607d3435.dhcp.thefacebook.com>

On 05/09, Alexei Starovoitov wrote:
> On Fri, Apr 29, 2022 at 02:15:34PM -0700, Stanislav Fomichev wrote:
> > Previous patch adds 1:1 mapping between all 211 LSM hooks
> > and bpf_cgroup program array. Instead of reserving a slot per
> > possible hook, reserve 10 slots per cgroup for lsm programs.
> > Those slots are dynamically allocated on demand and reclaimed.
> >
> > It should be possible to eventually extend this idea to all hooks if
> > the memory consumption is unacceptable and shrink overall effective
> > programs array.
> >
> > struct cgroup_bpf {
> > 	struct bpf_prog_array *    effective[33];        /*     0   264 */
> > 	/* --- cacheline 4 boundary (256 bytes) was 8 bytes ago --- */
> > 	struct hlist_head          progs[33];            /*   264   264 */
> > 	/* --- cacheline 8 boundary (512 bytes) was 16 bytes ago --- */
> > 	u8                         flags[33];            /*   528    33 */
> >
> > 	/* XXX 7 bytes hole, try to pack */
> >
> > 	struct list_head           storages;             /*   568    16 */
> > 	/* --- cacheline 9 boundary (576 bytes) was 8 bytes ago --- */
> > 	struct bpf_prog_array *    inactive;             /*   584     8 */
> > 	struct percpu_ref          refcnt;               /*   592    16 */
> > 	struct work_struct         release_work;         /*   608    72 */
> >
> > 	/* size: 680, cachelines: 11, members: 7 */
> > 	/* sum members: 673, holes: 1, sum holes: 7 */
> > 	/* last cacheline: 40 bytes */
> > };
> >
> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > ---
> >  include/linux/bpf-cgroup-defs.h |   3 +-
> >  include/linux/bpf_lsm.h         |   6 --
> >  kernel/bpf/bpf_lsm.c            |   5 --
> >  kernel/bpf/cgroup.c             | 107 +++++++++++++++++++++++++++-----
> >  4 files changed, 94 insertions(+), 27 deletions(-)
> >
> > diff --git a/include/linux/bpf-cgroup-defs.h b/include/linux/bpf-cgroup-defs.h
> > index d5a70a35dace..359d3f16abea 100644
> > --- a/include/linux/bpf-cgroup-defs.h
> > +++ b/include/linux/bpf-cgroup-defs.h
> > @@ -10,7 +10,8 @@
> >
> >  struct bpf_prog_array;
> >
> > -#define CGROUP_LSM_NUM 211 /* will be addressed in the next patch */
> > +/* Maximum number of concurrently attachable per-cgroup LSM hooks. */
> > +#define CGROUP_LSM_NUM 10
> >
> >  enum cgroup_bpf_attach_type {
> >  	CGROUP_BPF_ATTACH_TYPE_INVALID = -1,
> > diff --git a/include/linux/bpf_lsm.h b/include/linux/bpf_lsm.h
> > index 7f0e59f5f9be..613de44aa429 100644
> > --- a/include/linux/bpf_lsm.h
> > +++ b/include/linux/bpf_lsm.h
> > @@ -43,7 +43,6 @@ extern const struct bpf_func_proto bpf_inode_storage_delete_proto;
> >  void bpf_inode_storage_free(struct inode *inode);
> >
> >  int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog, bpf_func_t *bpf_func);
> > -int bpf_lsm_hook_idx(u32 btf_id);
> >
> >  #else /* !CONFIG_BPF_LSM */
> >
> > @@ -74,11 +73,6 @@ static inline int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
> >  	return -ENOENT;
> >  }
> >
> > -static inline int bpf_lsm_hook_idx(u32 btf_id)
> > -{
> > -	return -EINVAL;
> > -}
> > -
> >  #endif /* CONFIG_BPF_LSM */
> >
> >  #endif /* _LINUX_BPF_LSM_H */
> > diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
> > index a0e68ef5dfb1..1079c747e061 100644
> > --- a/kernel/bpf/bpf_lsm.c
> > +++ b/kernel/bpf/bpf_lsm.c
> > @@ -91,11 +91,6 @@ int bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
> >  	return 0;
> >  }
> >
> > -int bpf_lsm_hook_idx(u32 btf_id)
> > -{
> > -	return btf_id_set_index(&bpf_lsm_hooks, btf_id);
> > -}
> > -
> >  int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
> >  			const struct bpf_prog *prog)
> >  {
> > diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> > index 9cc38454e402..787ff6cf8d42 100644
> > --- a/kernel/bpf/cgroup.c
> > +++ b/kernel/bpf/cgroup.c
> > @@ -79,10 +79,13 @@ unsigned int __cgroup_bpf_run_lsm_sock(const void *ctx,
> >  	shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi));
> >
> >  	cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data);
> > -	if (likely(cgrp))
> > +	if (likely(cgrp)) {
> > +		rcu_read_lock(); /* See bpf_lsm_attach_type_get(). */

> I've looked at bpf_lsm_attach_type_get/put, but still don't get it :)
> shim_prog->aux->cgroup_atype stays the same for the life of shim_prog.
> atype_usecnt will go up and down, but atype_usecnt == 0 is the only
> interesting one from the pov of selecting atype in _get().
> And there shim_prog will be detached and trampoline destroyed.
> The shim_prog->aux->cgroup_atype deref below cannot be happening on
> freed shim_prog.
> So what is the point of this critical section and sync_rcu() ?
> It seems none of it is necessary.

I was trying to guard against the reuse of the same cgroup_atype:

CPU0                                     CPU1
__cgroup_bpf_run_lsm_socket:
atype = shim_prog->aux->cgroup_atype
                                         __cgroup_bpf_detach
                                         bpf_lsm_attach_type_put(shim_prog attach_btf_id)
                                         __cgroup_bpf_attach(another hook)
                                         bpf_lsm_attach_type_get(another btf_id)
                                         ^^^ can reuse the same cgroup_atype
array = cgrp->effective[atype]
^^^ run effective from another btf_id?

So I added the synchronize_rcu() to wait for existing shim_prog users to
exit. Am I being too paranoid? Maybe if I move bpf_lsm_attach_type_put()
deep into bpf_prog_put() (the deferred path), that won't be an issue and
we can drop both the synchronize_rcu() and the read lock?
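
To make the reuse concern concrete, here is a minimal userspace sketch of a
refcounted slot table. It is only a model under my own assumptions: the names
slot_get(), slot_put() and struct lsm_slot are hypothetical stand-ins, not the
code from the patch, and there is no locking or RCU in it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model of the slot table; not kernel code. */
#define CGROUP_LSM_NUM 10

struct lsm_slot {
	uint32_t attach_btf_id;	/* hook that currently owns the slot */
	int usecnt;		/* number of attachments using the slot */
};

static struct lsm_slot slots[CGROUP_LSM_NUM];

/*
 * Return the slot index already assigned to btf_id, or claim the first
 * free slot for it. Returns -1 when all slots are in use.
 */
static int slot_get(uint32_t btf_id)
{
	int free_idx = -1;

	for (int i = 0; i < CGROUP_LSM_NUM; i++) {
		if (slots[i].usecnt > 0 && slots[i].attach_btf_id == btf_id) {
			slots[i].usecnt++;
			return i;
		}
		if (slots[i].usecnt == 0 && free_idx < 0)
			free_idx = i;
	}
	if (free_idx < 0)
		return -1;

	slots[free_idx].attach_btf_id = btf_id;
	slots[free_idx].usecnt = 1;
	return free_idx;
}

/* Drop one reference; the slot becomes reusable once usecnt hits zero. */
static void slot_put(int idx)
{
	assert(idx >= 0 && idx < CGROUP_LSM_NUM);
	if (slots[idx].usecnt > 0)
		slots[idx].usecnt--;
}
```

In this model, a reader that cached the index for one hook before the last
slot_put() would, after a slot_get() for a different hook, index a slot that
now belongs to that other hook. That is the window I was trying to close.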

> > It should be possible to eventually extend this idea to all hooks if
> > the memory consumption is unacceptable and shrink overall effective
> > programs array.

> Having BPF_LSM_CGROUP handle atype differently looks too special.
> Why not do this generalization right now?
> Do atype_get for all cgroup hooks and get rid of to_cgroup_bpf_attach_type ?
> Combine ranges of attach_btf_id for lsm_cgroup and enum bpf_attach_type
> for traditional cgroup hooks into single _get() method that returns a slot
> in effective[] array ?
> attach/detach/query apis won't notice this internal implementation detail.

I'm being extra cautious by using this new allocation scheme for LSM only.
If there is no general pushback, I can try to convert everything at the
same time.
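
For discussion, a rough userspace sketch of what a single _get() covering
both ranges could look like. Everything here is hypothetical (struct slot_key,
the key_kind tags, atype_get() and NUM_SLOTS are names I made up for the
model); it only shows that the two id spaces can share one slot table as
long as the key records which space the id comes from.

```c
#include <stdint.h>

/* Hypothetical model: one slot table for classic and LSM cgroup hooks. */
#define NUM_SLOTS 8

enum key_kind { KEY_NONE = 0, KEY_ATTACH_TYPE, KEY_LSM_BTF_ID };

struct slot_key {
	enum key_kind kind;	/* which id space 'id' belongs to */
	uint32_t id;		/* attach type value or attach_btf_id */
};

struct slot {
	struct slot_key key;
	int usecnt;
};

static struct slot table[NUM_SLOTS];

static int key_eq(struct slot_key a, struct slot_key b)
{
	return a.kind == b.kind && a.id == b.id;
}

/*
 * Single lookup-or-allocate entry point for both kinds of hooks.
 * Returns the index into the shared table, or -1 when it is full.
 */
static int atype_get(struct slot_key key)
{
	int free_idx = -1;

	for (int i = 0; i < NUM_SLOTS; i++) {
		if (table[i].usecnt > 0 && key_eq(table[i].key, key)) {
			table[i].usecnt++;
			return i;
		}
		if (table[i].usecnt == 0 && free_idx < 0)
			free_idx = i;
	}
	if (free_idx < 0)
		return -1;

	table[free_idx].key = key;
	table[free_idx].usecnt = 1;
	return free_idx;
}
```

Tagging the key with its kind keeps an attach type value and a btf_id that
happen to be numerically equal from colliding on the same slot.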
