Re: [PATCH bpf-next 2/4] netfilter: add bpf_xdp_flow_offload_lookup kfunc

netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
To: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Cc: Lorenzo Bianconi <lorenzo@kernel.org>,
	bpf@vger.kernel.org, pablo@netfilter.org, kadlec@netfilter.org,
	davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	pabeni@redhat.com, netfilter-devel@vger.kernel.org,
	netdev@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net,
	andrii@kernel.org, toke@redhat.com, fw@strlen.de,
	hawk@kernel.org, horms@kernel.org, donhunte@redhat.com
Subject: Re: [PATCH bpf-next 2/4] netfilter: add bpf_xdp_flow_offload_lookup kfunc
Date: Fri, 17 May 2024 21:54:23 +0200	[thread overview]
Message-ID: <Zke172XFjqUGTE6O@lore-desk> (raw)
In-Reply-To: <CAP01T76razfX1e7BsMbbyecPF+RjtJYoZifR-Um_BAoyPNOyKg@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 4153 bytes --]

[...]
> > +       tuplehash = flow_offload_lookup(flow_table, tuple);
> > +       if (!tuplehash)
> > +               return ERR_PTR(-ENOENT);
> 
> This is fine to do, but the caller should catch it using IS_ERR_PTR
> and return NULL.
> BPF side cannot distinguish ERR_PTR from normal pointer, so this will
> cause a bad deref in the program.

ack, I will fix it in v2.

> 
> > +
> > +       flow = container_of(tuplehash, struct flow_offload,
> > +                           tuplehash[tuplehash->tuple.dir]);
> > +       flow_offload_refresh(flow_table, flow, false);
> > +
> > +       return tuplehash;
> > +}
> > +
> > +__bpf_kfunc struct flow_offload_tuple_rhash *
> > +bpf_xdp_flow_offload_lookup(struct xdp_md *ctx,
> > +                           struct bpf_fib_lookup *fib_tuple,
> > +                           u32 fib_tuple__sz)
> 
> Do you think the __sz has the intended effect? It only works when the
> preceding parameter is a void *.
> If you have a type like struct bpf_fib_lookup, I think it should work
> fine without taking a size at all.

ack, I will fix it in v2.

> 
> > +{
> > +       struct xdp_buff *xdp = (struct xdp_buff *)ctx;
> > +       struct flow_offload_tuple tuple = {
> > +               .iifidx = fib_tuple->ifindex,
> > +               .l3proto = fib_tuple->family,
> > +               .l4proto = fib_tuple->l4_protocol,
> > +               .src_port = fib_tuple->sport,
> > +               .dst_port = fib_tuple->dport,
> > +       };
> > +       __be16 proto;
> > +
> > +       switch (fib_tuple->family) {
> > +       case AF_INET:
> > +               tuple.src_v4.s_addr = fib_tuple->ipv4_src;
> > +               tuple.dst_v4.s_addr = fib_tuple->ipv4_dst;
> > +               proto = htons(ETH_P_IP);
> > +               break;
> > +       case AF_INET6:
> > +               tuple.src_v6 = *(struct in6_addr *)&fib_tuple->ipv6_src;
> > +               tuple.dst_v6 = *(struct in6_addr *)&fib_tuple->ipv6_dst;
> > +               proto = htons(ETH_P_IPV6);
> > +               break;
> > +       default:
> > +               return ERR_PTR(-EINVAL);
> 
> Likewise. While you check IS_ERR_VALUE in selftest, direct dereference
> will be allowed by verifier, which would crash the kernel.
> It's better to do something like conntrack kfuncs, where they set
> opts->error when returning NULL, allowing better debugging in case
> lookup fails.

ack, I will fix it in v2.

> 
> > +       }
> > +
> > +       return bpf_xdp_flow_offload_tuple_lookup(xdp->rxq->dev, &tuple, proto);
> > +}
> > +
> > +__diag_pop()
> > +
> > +BTF_KFUNCS_START(nf_ft_kfunc_set)
> > +BTF_ID_FLAGS(func, bpf_xdp_flow_offload_lookup)
> > +BTF_KFUNCS_END(nf_ft_kfunc_set)
> > +
> > +static const struct btf_kfunc_id_set nf_flow_offload_kfunc_set = {
> > +       .owner = THIS_MODULE,
> > +       .set   = &nf_ft_kfunc_set,
> > +};
> > +
> > +int nf_flow_offload_register_bpf(void)
> > +{
> > +       return register_btf_kfunc_id_set(BPF_PROG_TYPE_XDP,
> > +                                        &nf_flow_offload_kfunc_set);
> > +}
> 
> We should probably also expose it to skb? We just need net_device, so
> it can work with both XDP and TC progs.
> That would be similar to how we expose conntrack kfuncs to both XDP
> and TC progs.

I think we will get very similar results to sw flowtable in this case,
don't you think?

> 
> > +EXPORT_SYMBOL_GPL(nf_flow_offload_register_bpf);
> > diff --git a/net/netfilter/nf_flow_table_inet.c b/net/netfilter/nf_flow_table_inet.c
> > index 6eef15648b7b0..b13587238eceb 100644
> > --- a/net/netfilter/nf_flow_table_inet.c
> > +++ b/net/netfilter/nf_flow_table_inet.c
> > @@ -98,6 +98,8 @@ static int __init nf_flow_inet_module_init(void)
> >         nft_register_flowtable_type(&flowtable_ipv6);
> >         nft_register_flowtable_type(&flowtable_inet);
> >
> > +       nf_flow_offload_register_bpf();
> > +
> 
> Error checking needed here? Kfunc registration can fail.

ack, I will fix it.

Regards,
Lorenzo

> 
> >         return 0;
> >  }
> >
> > --
> > 2.45.0
> >
> >
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

next prev parent reply	other threads:[~2024-05-17 19:54 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-15 21:12 [PATCH bpf-next 0/4] netfilter: Add the capability to offload flowtable in XDP layer Lorenzo Bianconi
2024-05-15 21:12 ` [PATCH bpf-next 1/4] netfilter: nf_tables: add flowtable map for xdp offload Lorenzo Bianconi
2024-05-15 21:12 ` [PATCH bpf-next 2/4] netfilter: add bpf_xdp_flow_offload_lookup kfunc Lorenzo Bianconi
2024-05-15 21:59   ` Kumar Kartikeya Dwivedi
2024-05-17 19:54     ` Lorenzo Bianconi [this message]
2024-05-15 21:12 ` [PATCH bpf-next 3/4] samples/bpf: Add bpf sample to offload flowtable traffic to xdp Lorenzo Bianconi
2024-05-15 21:12 ` [PATCH bpf-next 4/4] selftests/bpf: Add selftest for bpf_xdp_flow_offload_lookup kfunc Lorenzo Bianconi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zke172XFjqUGTE6O@lore-desk \
    --to=lorenzo.bianconi@redhat.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=donhunte@redhat.com \
    --cc=edumazet@google.com \
    --cc=fw@strlen.de \
    --cc=hawk@kernel.org \
    --cc=horms@kernel.org \
    --cc=kadlec@netfilter.org \
    --cc=kuba@kernel.org \
    --cc=lorenzo@kernel.org \
    --cc=memxor@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=pablo@netfilter.org \
    --cc=toke@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).