netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
To: Amery Hung <ameryhung@gmail.com>
Cc: <bpf@vger.kernel.org>, <ast@kernel.org>, <daniel@iogearbox.net>,
	<andrii@kernel.org>, <netdev@vger.kernel.org>,
	<magnus.karlsson@intel.com>, <martin.lau@linux.dev>
Subject: Re: [PATCH bpf-next 2/3] bpf: add kfunc for skb refcounting
Date: Fri, 21 Feb 2025 21:17:33 +0100	[thread overview]
Message-ID: <Z7jfXXQ06MrlallF@boxer> (raw)
In-Reply-To: <CAMB2axNJjsytoFrYF=PdsOOWE-bbficZa-54C9YHT5JFu5PFBQ@mail.gmail.com>

On Fri, Feb 21, 2025 at 11:11:27AM -0800, Amery Hung wrote:
> On Fri, Feb 21, 2025 at 6:57 AM Maciej Fijalkowski
> <maciej.fijalkowski@intel.com> wrote:
> >
> > On Thu, Feb 20, 2025 at 03:25:03PM -0800, Amery Hung wrote:
> > >
> > >
> > > On 2/20/2025 5:45 AM, Maciej Fijalkowski wrote:
> > > > These have been mostly taken from Amery Hung's work related to bpf qdisc
> > > > implementation. bpf_skb_{acquire,release}() are for increment/decrement
> > > > sk_buff::users whereas bpf_skb_destroy() is called for map entries that
> > > > have not been released and map is being wiped out from system.
> > > >
> > > > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > > > ---
> > > >   net/core/filter.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++
> > > >   1 file changed, 62 insertions(+)
> > > >
> > > > diff --git a/net/core/filter.c b/net/core/filter.c
> > > > index 2ec162dd83c4..9bd2701be088 100644
> > > > --- a/net/core/filter.c
> > > > +++ b/net/core/filter.c
> > > > @@ -12064,6 +12064,56 @@ __bpf_kfunc int bpf_sk_assign_tcp_reqsk(struct __sk_buff *s, struct sock *sk,
> > > >   __bpf_kfunc_end_defs();
> > > > +__diag_push();
> > > > +__diag_ignore_all("-Wmissing-prototypes",
> > > > +             "Global functions as their definitions will be in vmlinux BTF");
> > > > +
> > > > +/* bpf_skb_acquire - Acquire a reference to an skb. An skb acquired by this
> > > > + * kfunc which is not stored in a map as a kptr, must be released by calling
> > > > + * bpf_skb_release().
> > > > + * @skb: The skb on which a reference is being acquired.
> > > > + */
> > > > +__bpf_kfunc struct sk_buff *bpf_skb_acquire(struct sk_buff *skb)
> > > > +{
> > > > +   if (refcount_inc_not_zero(&skb->users))
> > > > +           return skb;
> > > > +   return NULL;
> > > > +}
> > > > +
> > > > +/* bpf_skb_release - Release the reference acquired on an skb.
> > > > + * @skb: The skb on which a reference is being released.
> > > > + */
> > > > +__bpf_kfunc void bpf_skb_release(struct sk_buff *skb)
> > > > +{
> > > > +   skb_unref(skb);
> > > > +}
> > > > +
> > > > +/* bpf_skb_destroy - Release an skb reference acquired and exchanged into
> > > > + * an allocated object or a map.
> > > > + * @skb: The skb on which a reference is being released.
> > > > + */
> > > > +__bpf_kfunc void bpf_skb_destroy(struct sk_buff *skb)
> > > > +{
> > > > +   (void)skb_unref(skb);
> > > > +   consume_skb(skb);
> > > > +}
> > > > +
> > > > +__diag_pop();
> > > > +
> > > > +BTF_KFUNCS_START(skb_kfunc_btf_ids)
> > > > +BTF_ID_FLAGS(func, bpf_skb_acquire, KF_ACQUIRE | KF_RET_NULL)
> > > > +BTF_ID_FLAGS(func, bpf_skb_release, KF_RELEASE)
> > > > +BTF_KFUNCS_END(skb_kfunc_btf_ids)
> > > > +
> > > > +static const struct btf_kfunc_id_set skb_kfunc_set = {
> > > > +   .owner = THIS_MODULE,
> > > > +   .set   = &skb_kfunc_btf_ids,
> > > > +};
> > > > +
> > > > +BTF_ID_LIST(skb_kfunc_dtor_ids)
> > > > +BTF_ID(struct, sk_buff)
> > > > +BTF_ID_FLAGS(func, bpf_skb_destroy, KF_RELEASE)
> > > > +
> > > >   int bpf_dynptr_from_skb_rdonly(struct __sk_buff *skb, u64 flags,
> > > >                            struct bpf_dynptr *ptr__uninit)
> > > >   {
> > > > @@ -12117,6 +12167,13 @@ static const struct btf_kfunc_id_set bpf_kfunc_set_tcp_reqsk = {
> > > >   static int __init bpf_kfunc_init(void)
> > > >   {
> > > > +   const struct btf_id_dtor_kfunc skb_kfunc_dtors[] = {
> > > > +           {
> > > > +                   .btf_id       = skb_kfunc_dtor_ids[0],
> > > > +                   .kfunc_btf_id = skb_kfunc_dtor_ids[1]
> > > > +           },
> > > > +   };
> > > > +
> > > >     int ret;
> > > >     ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &bpf_kfunc_set_skb);
> > > > @@ -12133,6 +12190,11 @@ static int __init bpf_kfunc_init(void)
> > > >     ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_XDP, &bpf_kfunc_set_xdp);
> > > >     ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_CGROUP_SOCK_ADDR,
> > > >                                            &bpf_kfunc_set_sock_addr);
> > > > +   ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &skb_kfunc_set);
> > > > +
> > > > +   ret = ret ?: register_btf_id_dtor_kfuncs(skb_kfunc_dtors,
> > > > +                                            ARRAY_SIZE(skb_kfunc_dtors),
> > > > +                                            THIS_MODULE);
> > >
> > > I think we will need to deal with two versions of skb dtors here. Both qdisc
> > > and cls will register dtor associated for skb. The qdisc one just call
> > > kfree_skb(). While only one can exist for a specific btf id in the kernel if
> > > I understand correctly. Is it possible to have one that work
> > > for both use cases?
> >
> > Looking at the current code it seems bpf_find_btf_id() (which
> > btf_parse_kptr() calls) will go through modules and return the first match
> > against sk_buff btf but that's currently a wild guess from my side. So
> > your concern stands as we have no mechanism that would distinguish the
> > dtors for same btf id.
> >
> > I would have to take a deeper look at btf_parse_kptr() and find some way
> > to associate dtor with its module during registering and then use it
> > within btf_find_dtor_kfunc(). Would this be sufficient?
> >
> 
> That might not be enough. Ultimately, if the user configures both
> modules to be built-in, then there is no way to tell where a trusted
> skb kptr in a bpf program is from.

Am I missing something or how are you going to use the kfuncs that are
required for loading skb onto map as kptr without loaded module? Dtors are
owned by same module as corresponding acquire/release kfuncs.

> 
> Two possible ways to solve this:
> 
> 1. Make the dtor be able to tell whether the skb is from qdisc or cls.
> Since we are both in the TC layer, maybe we can use skb->cb for this?
> 
> 2. Associate KF_ACQUIRE kfuncs with the corresponding KF_RELEASE
> kfuncs somehow. Carry this additional information as the kptr
> propagates in the bpf world so that we know which dtor to call. This
> seems to be overly complicated.
> 
> 
> > >
> > > >     return ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &bpf_kfunc_set_tcp_reqsk);
> > > >   }
> > > >   late_initcall(bpf_kfunc_init);
> > >

  reply	other threads:[~2025-02-21 20:17 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-20 13:45 [PATCH bpf-next 0/3] bpf: introduce skb refcount kfuncs Maciej Fijalkowski
2025-02-20 13:45 ` [PATCH bpf-next 1/3] bpf: call btf_is_projection_of() conditionally Maciej Fijalkowski
2025-02-20 13:45 ` [PATCH bpf-next 2/3] bpf: add kfunc for skb refcounting Maciej Fijalkowski
2025-02-20 23:25   ` Amery Hung
2025-02-21 14:56     ` Maciej Fijalkowski
2025-02-21 19:11       ` Amery Hung
2025-02-21 20:17         ` Maciej Fijalkowski [this message]
2025-02-21 23:40           ` Amery Hung
2025-02-21 23:57   ` Amery Hung
2025-02-20 13:45 ` [PATCH bpf-next 3/3] selftests: bpf: implement test case for skb kptr map storage Maciej Fijalkowski
2025-02-22  1:55 ` [PATCH bpf-next 0/3] bpf: introduce skb refcount kfuncs Alexei Starovoitov
2025-02-25 14:27   ` Maciej Fijalkowski
2025-02-26 17:30     ` Alexei Starovoitov
2025-02-27 18:40       ` Maciej Fijalkowski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z7jfXXQ06MrlallF@boxer \
    --to=maciej.fijalkowski@intel.com \
    --cc=ameryhung@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=magnus.karlsson@intel.com \
    --cc=martin.lau@linux.dev \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).