From: Louis DeLosSantos <louis.delos.devel@gmail.com>
To: Yonghong Song <yhs@meta.com>,
bpf@vger.kernel.org, netdev@vger.kernel.org
Cc: Martin KaFai Lau <martin.lau@linux.dev>,
Stanislav Fomichev <sdf@google.com>,
razor@blackwall.org
Subject: Re: [PATCH 1/2] bpf: add table ID to bpf_fib_lookup BPF helper
Date: Fri, 26 May 2023 10:11:43 -0400 [thread overview]
Message-ID: <ZHC+H7ZBzCN4kPc/@fedora> (raw)
In-Reply-To: <cc91d53d-0a73-cf30-bd7f-b22db8228842@meta.com>
On Thu, May 25, 2023 at 11:01:34PM -0700, Yonghong Song wrote:
>
>
> On 5/25/23 7:27 AM, Louis DeLosSantos wrote:
> > Add ability to specify routing table ID to the `bpf_fib_lookup` BPF
> > helper.
> >
> > A new field `tbid` is added to `struct bpf_fib_lookup` used as
> > parameters to the `bpf_fib_lookup` BPF helper.
> >
> > When the helper is called with the `BPF_FIB_LOOKUP_DIRECT` flag and the
> > `tbid` field in `struct bpf_fib_lookup` is greater then 0, the `tbid`
> > field will be used as the table ID for the fib lookup.
>
> I think table id 0 is legal in the kernel, right?
> It is probably okay to consider table id 0 not supported to
> simplify the user interface. But it would be great to
> add some explanations in the commit message.
>
> >
> > If the `tbid` does not exist the fib lookup will fail with
> > `BPF_FIB_LKUP_RET_NOT_FWDED`.
> >
> > The `tbid` field becomes a union over the vlan related output fields in
> > `struct bpf_fib_lookup` and will be zeroed immediately after usage.
> >
> > This functionality is useful in containerized environments.
> >
> > For instance, if a CNI wants to dictate the next-hop for traffic leaving
> > a container it can create a container-specific routing table and perform
> > a fib lookup against this table in a "host-net-namespace-side" TC program.
> >
> > This functionality also allows `ip rule` like functionality at the TC
> > layer, allowing an eBPF program to pick a routing table based on some
> > aspect of the sk_buff.
> >
> > As a concrete use case, this feature will be used in Cilium's SRv6 L3VPN
> > datapath.
> >
> > When egress traffic leaves a Pod an eBPF program attached by Cilium will
> > determine which VRF the egress traffic should target, and then perform a
> > FIB lookup in a specific table representing this VRF's FIB.
> >
> > Signed-off-by: Louis DeLosSantos <louis.delos.devel@gmail.com>
> > ---
> > include/uapi/linux/bpf.h | 17 ++++++++++++++---
> > net/core/filter.c | 12 ++++++++++++
> > tools/include/uapi/linux/bpf.h | 17 ++++++++++++++---
> > 3 files changed, 40 insertions(+), 6 deletions(-)
> >
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index 1bb11a6ee6676..2096fbb328a9b 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -3167,6 +3167,8 @@ union bpf_attr {
> > * **BPF_FIB_LOOKUP_DIRECT**
> > * Do a direct table lookup vs full lookup using FIB
> > * rules.
> > + * If *params*->tbid is non-zero, this value designates
> > + * a routing table ID to perform the lookup against.
> > * **BPF_FIB_LOOKUP_OUTPUT**
> > * Perform lookup from an egress perspective (default is
> > * ingress).
> > @@ -6881,9 +6883,18 @@ struct bpf_fib_lookup {
> > __u32 ipv6_dst[4]; /* in6_addr; network order */
> > };
> > - /* output */
> > - __be16 h_vlan_proto;
> > - __be16 h_vlan_TCI;
> > + union {
> > + struct {
> > + /* output */
> > + __be16 h_vlan_proto;
> > + __be16 h_vlan_TCI;
> > + };
> > + /* input: when accompanied with the 'BPF_FIB_LOOKUP_DIRECT` flag, a
> > + * specific routing table to use for the fib lookup.
> > + */
> > + __u32 tbid;
> > + };
> > +
> > __u8 smac[6]; /* ETH_ALEN */
> > __u8 dmac[6]; /* ETH_ALEN */
> > };
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index 451b0ec7f2421..6f710aa0a54b3 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -5803,6 +5803,12 @@ static int bpf_ipv4_fib_lookup(struct net *net, struct bpf_fib_lookup *params,
> > u32 tbid = l3mdev_fib_table_rcu(dev) ? : RT_TABLE_MAIN;
> > struct fib_table *tb;
> > + if (params->tbid) {
> > + tbid = params->tbid;
> > + /* zero out for vlan output */
> > + params->tbid = 0;
> > + }
> > +
> > tb = fib_get_table(net, tbid);
> > if (unlikely(!tb))
> > return BPF_FIB_LKUP_RET_NOT_FWDED;
> > @@ -5936,6 +5942,12 @@ static int bpf_ipv6_fib_lookup(struct net *net, struct bpf_fib_lookup *params,
> > u32 tbid = l3mdev_fib_table_rcu(dev) ? : RT_TABLE_MAIN;
> > struct fib6_table *tb;
> > + if (params->tbid) {
> > + tbid = params->tbid;
> > + /* zero out for vlan output */
> > + params->tbid = 0;
> > + }
> > +
> > tb = ipv6_stub->fib6_get_table(net, tbid);
> > if (unlikely(!tb))
> > return BPF_FIB_LKUP_RET_NOT_FWDED;
> > diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> > index 1bb11a6ee6676..2096fbb328a9b 100644
> > --- a/tools/include/uapi/linux/bpf.h
> > +++ b/tools/include/uapi/linux/bpf.h
> > @@ -3167,6 +3167,8 @@ union bpf_attr {
> > * **BPF_FIB_LOOKUP_DIRECT**
> > * Do a direct table lookup vs full lookup using FIB
> > * rules.
> > + * If *params*->tbid is non-zero, this value designates
> > + * a routing table ID to perform the lookup against.
> > * **BPF_FIB_LOOKUP_OUTPUT**
> > * Perform lookup from an egress perspective (default is
> > * ingress).
> > @@ -6881,9 +6883,18 @@ struct bpf_fib_lookup {
> > __u32 ipv6_dst[4]; /* in6_addr; network order */
> > };
> > - /* output */
> > - __be16 h_vlan_proto;
> > - __be16 h_vlan_TCI;
> > + union {
> > + struct {
> > + /* output */
> > + __be16 h_vlan_proto;
> > + __be16 h_vlan_TCI;
> > + };
> > + /* input: when accompanied with the 'BPF_FIB_LOOKUP_DIRECT` flag, a
> > + * specific routing table to use for the fib lookup.
> > + */
> > + __u32 tbid;
> > + };
> > +
> > __u8 smac[6]; /* ETH_ALEN */
> > __u8 dmac[6]; /* ETH_ALEN */
> > };
> >
> I think table id 0 is legal in the kernel, right?
> It is probably okay to consider table id 0 not supported to
> simplify the user interface. But it would be great to
> add some explanations in the commit message.
Agreed.
My initial feelings were there is no real use case to query against the Kernel's
`all` table.
The response from John will dictate if this remains the case, as the suggestion
of using a new flag bit will nullify this issue, I think.
If it stays tho, I will def add details in the commit message around this on next
rev.
next prev parent reply other threads:[~2023-05-26 14:11 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-25 14:27 [PATCH 0/2] bpf: utilize table ID in bpf_fib_lookup helper Louis DeLosSantos
2023-05-25 14:27 ` [PATCH 1/2] bpf: add table ID to bpf_fib_lookup BPF helper Louis DeLosSantos
2023-05-26 6:01 ` Yonghong Song
2023-05-26 14:11 ` Louis DeLosSantos [this message]
2023-05-26 6:48 ` John Fastabend
2023-05-26 14:07 ` Louis DeLosSantos
2023-05-26 16:58 ` Yonghong Song
2023-05-25 14:28 ` [PATCH 2/2] bpf: test table ID fib lookup " Louis DeLosSantos
2023-05-26 6:02 ` Yonghong Song
2023-05-26 14:23 ` Louis DeLosSantos
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZHC+H7ZBzCN4kPc/@fedora \
--to=louis.delos.devel@gmail.com \
--cc=bpf@vger.kernel.org \
--cc=martin.lau@linux.dev \
--cc=netdev@vger.kernel.org \
--cc=razor@blackwall.org \
--cc=sdf@google.com \
--cc=yhs@meta.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.