From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
patches@lists.linux.dev,
Louis DeLosSantos <louis.delos.devel@gmail.com>,
Daniel Borkmann <daniel@iogearbox.net>
Subject: [PATCH 5.15 80/84] bpf: Add table ID to bpf_fib_lookup BPF helper
Date: Mon, 4 Mar 2024 21:24:53 +0000
Message-ID: <20240304211545.082544823@linuxfoundation.org>
In-Reply-To: <20240304211542.332206551@linuxfoundation.org>
5.15-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Louis DeLosSantos <louis.delos.devel@gmail.com>

commit 8ad77e72caae22a1ddcfd0c03f2884929e93b7a4 upstream.

Add ability to specify routing table ID to the `bpf_fib_lookup` BPF
helper.

A new field `tbid` is added to `struct bpf_fib_lookup`, which is used as
the parameters to the `bpf_fib_lookup` BPF helper.

When the helper is called with the `BPF_FIB_LOOKUP_DIRECT` and
`BPF_FIB_LOOKUP_TBID` flags, the `tbid` field in `struct bpf_fib_lookup`
will be used as the table ID for the fib lookup.

If the table named by `tbid` does not exist, the fib lookup will fail with
`BPF_FIB_LKUP_RET_NOT_FWDED`.

The `tbid` field shares a union with the vlan-related output fields
in `struct bpf_fib_lookup` and will be zeroed immediately after usage.
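The table-selection behaviour described above can be sketched as a small
user-space model. This is not the kernel code: every `model_*` / `MODEL_*`
name is hypothetical, the set of "existing" tables is made up, and the return
codes are illustrative stand-ins for `BPF_FIB_LKUP_RET_*`. Only the two flag
values are taken from this patch. The sketch models the direct-lookup path
only and shows why `tbid` reads back as zero after the call.

```c
#include <assert.h>
#include <stdint.h>

/* Flag bit positions as added/used by this patch. */
#define MODEL_FIB_LOOKUP_DIRECT (1U << 0)
#define MODEL_FIB_LOOKUP_TBID   (1U << 3)

/* Illustrative return codes (stand-ins, not the uapi values). */
enum { MODEL_RET_SUCCESS, MODEL_RET_NOT_FWDED };

/* Reduced model of the new union: tbid overlays the two vlan outputs. */
struct model_fib_params {
	union {
		struct {
			uint16_t h_vlan_proto;	/* output */
			uint16_t h_vlan_TCI;	/* output */
		};
		uint32_t tbid;			/* input */
	};
};

/* The input field occupies exactly the bytes of the two output fields. */
static_assert(sizeof(struct model_fib_params) == sizeof(uint32_t),
	      "tbid must alias the vlan output fields");

/* Hypothetical stand-in for the table lookup: only 100 and 254 exist. */
static int model_table_exists(uint32_t tbid)
{
	return tbid == 100 || tbid == 254;
}

/* Model of the direct-lookup path: pick the table, then clear the input. */
static int model_select_table(struct model_fib_params *params, uint32_t flags,
			      uint32_t default_tbid, uint32_t *chosen)
{
	uint32_t tbid = default_tbid;

	if (flags & MODEL_FIB_LOOKUP_TBID) {
		tbid = params->tbid;
		params->tbid = 0;	/* zero out for vlan output */
	}

	if (!model_table_exists(tbid))
		return MODEL_RET_NOT_FWDED;

	*chosen = tbid;
	return MODEL_RET_SUCCESS;
}
```

Because the input shares storage with the vlan outputs, a caller that reuses
one parameter block for several lookups would need to set `tbid` again before
each call.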
This functionality is useful in containerized environments.

For instance, if a CNI wants to dictate the next-hop for traffic leaving
a container, it can create a container-specific routing table and perform
a fib lookup against this table in a "host-net-namespace-side" TC program.

This also allows `ip rule`-like functionality at the TC
layer, allowing an eBPF program to pick a routing table based on some
aspect of the sk_buff.

As a concrete use case, this feature will be used in Cilium's SRv6 L3VPN
datapath.

When egress traffic leaves a Pod, an eBPF program attached by Cilium will
determine which VRF the egress traffic should target, and then perform a
FIB lookup in a specific table representing this VRF's FIB.
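The `ip rule`-like selection step can be sketched in plain C as follows. This
is a user-space illustration under stated assumptions, not a BPF program: the
`sketch_*` / `SKETCH_*` names, the choice of a packet mark as the selector,
and the rule table contents are all hypothetical. In a real TC program the
resulting values would be written to `params->tbid` and passed as the `flags`
argument of `bpf_fib_lookup`.

```c
#include <stddef.h>
#include <stdint.h>

/* Flag bit positions as added/used by this patch. */
#define SKETCH_FIB_LOOKUP_DIRECT (1U << 0)
#define SKETCH_FIB_LOOKUP_TBID   (1U << 3)

/* Hypothetical policy entry: packets carrying 'mark' use 'table_id'. */
struct sketch_rule {
	uint32_t mark;
	uint32_t table_id;
};

/* What the caller would feed into the helper. */
struct sketch_request {
	uint32_t tbid;	/* would be stored in params->tbid */
	uint32_t flags;	/* would be passed as the flags argument */
};

/*
 * Pick a routing table from some aspect of the packet (here: a mark).
 * On a match, request a direct lookup in that table. Without a match,
 * leave flags at 0 so the lookup falls back to the full FIB-rules path.
 */
static struct sketch_request
sketch_build_request(const struct sketch_rule *rules, size_t n, uint32_t mark)
{
	struct sketch_request req = { .tbid = 0, .flags = 0 };

	for (size_t i = 0; i < n; i++) {
		if (rules[i].mark == mark) {
			req.tbid = rules[i].table_id;
			req.flags = SKETCH_FIB_LOOKUP_DIRECT |
				    SKETCH_FIB_LOOKUP_TBID;
			break;
		}
	}

	return req;
}
```

Setting both flags together follows the helper documentation added by this
patch, which states that `BPF_FIB_LOOKUP_TBID` is used with
`BPF_FIB_LOOKUP_DIRECT`.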
Signed-off-by: Louis DeLosSantos <louis.delos.devel@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230505-bpf-add-tbid-fib-lookup-v2-1-0a31c22c748c@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
include/uapi/linux/bpf.h | 21 ++++++++++++++++++---
net/core/filter.c | 14 +++++++++++++-
tools/include/uapi/linux/bpf.h | 21 ++++++++++++++++++---
3 files changed, 49 insertions(+), 7 deletions(-)
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -3011,6 +3011,10 @@ union bpf_attr {
* **BPF_FIB_LOOKUP_DIRECT**
* Do a direct table lookup vs full lookup using FIB
* rules.
+ * **BPF_FIB_LOOKUP_TBID**
+ * Used with BPF_FIB_LOOKUP_DIRECT.
+ * Use the routing table ID present in *params*->tbid
+ * for the fib lookup.
* **BPF_FIB_LOOKUP_OUTPUT**
* Perform lookup from an egress perspective (default is
* ingress).
@@ -6046,6 +6050,7 @@ enum {
BPF_FIB_LOOKUP_DIRECT = (1U << 0),
BPF_FIB_LOOKUP_OUTPUT = (1U << 1),
BPF_FIB_LOOKUP_SKIP_NEIGH = (1U << 2),
+ BPF_FIB_LOOKUP_TBID = (1U << 3),
};
enum {
@@ -6106,9 +6111,19 @@ struct bpf_fib_lookup {
__u32 ipv6_dst[4]; /* in6_addr; network order */
};
- /* output */
- __be16 h_vlan_proto;
- __be16 h_vlan_TCI;
+ union {
+ struct {
+ /* output */
+ __be16 h_vlan_proto;
+ __be16 h_vlan_TCI;
+ };
+ /* input: when accompanied with the
+ * 'BPF_FIB_LOOKUP_DIRECT | BPF_FIB_LOOKUP_TBID` flags, a
+ * specific routing table to use for the fib lookup.
+ */
+ __u32 tbid;
+ };
+
__u8 smac[6]; /* ETH_ALEN */
__u8 dmac[6]; /* ETH_ALEN */
};
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5447,6 +5447,12 @@ static int bpf_ipv4_fib_lookup(struct ne
u32 tbid = l3mdev_fib_table_rcu(dev) ? : RT_TABLE_MAIN;
struct fib_table *tb;
+ if (flags & BPF_FIB_LOOKUP_TBID) {
+ tbid = params->tbid;
+ /* zero out for vlan output */
+ params->tbid = 0;
+ }
+
tb = fib_get_table(net, tbid);
if (unlikely(!tb))
return BPF_FIB_LKUP_RET_NOT_FWDED;
@@ -5580,6 +5586,12 @@ static int bpf_ipv6_fib_lookup(struct ne
u32 tbid = l3mdev_fib_table_rcu(dev) ? : RT_TABLE_MAIN;
struct fib6_table *tb;
+ if (flags & BPF_FIB_LOOKUP_TBID) {
+ tbid = params->tbid;
+ /* zero out for vlan output */
+ params->tbid = 0;
+ }
+
tb = ipv6_stub->fib6_get_table(net, tbid);
if (unlikely(!tb))
return BPF_FIB_LKUP_RET_NOT_FWDED;
@@ -5652,7 +5664,7 @@ set_fwd_params:
#endif
#define BPF_FIB_LOOKUP_MASK (BPF_FIB_LOOKUP_DIRECT | BPF_FIB_LOOKUP_OUTPUT | \
- BPF_FIB_LOOKUP_SKIP_NEIGH)
+ BPF_FIB_LOOKUP_SKIP_NEIGH | BPF_FIB_LOOKUP_TBID)
BPF_CALL_4(bpf_xdp_fib_lookup, struct xdp_buff *, ctx,
struct bpf_fib_lookup *, params, int, plen, u32, flags)
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -3011,6 +3011,10 @@ union bpf_attr {
* **BPF_FIB_LOOKUP_DIRECT**
* Do a direct table lookup vs full lookup using FIB
* rules.
+ * **BPF_FIB_LOOKUP_TBID**
+ * Used with BPF_FIB_LOOKUP_DIRECT.
+ * Use the routing table ID present in *params*->tbid
+ * for the fib lookup.
* **BPF_FIB_LOOKUP_OUTPUT**
* Perform lookup from an egress perspective (default is
* ingress).
@@ -6046,6 +6050,7 @@ enum {
BPF_FIB_LOOKUP_DIRECT = (1U << 0),
BPF_FIB_LOOKUP_OUTPUT = (1U << 1),
BPF_FIB_LOOKUP_SKIP_NEIGH = (1U << 2),
+ BPF_FIB_LOOKUP_TBID = (1U << 3),
};
enum {
@@ -6106,9 +6111,19 @@ struct bpf_fib_lookup {
__u32 ipv6_dst[4]; /* in6_addr; network order */
};
- /* output */
- __be16 h_vlan_proto;
- __be16 h_vlan_TCI;
+ union {
+ struct {
+ /* output */
+ __be16 h_vlan_proto;
+ __be16 h_vlan_TCI;
+ };
+ /* input: when accompanied with the
+ * 'BPF_FIB_LOOKUP_DIRECT | BPF_FIB_LOOKUP_TBID` flags, a
+ * specific routing table to use for the fib lookup.
+ */
+ __u32 tbid;
+ };
+
__u8 smac[6]; /* ETH_ALEN */
__u8 dmac[6]; /* ETH_ALEN */
};