* [PATCH v3 bpf-next 1/2] bpf: Add BPF_FIB_LOOKUP_SKIP_NEIGH for bpf_fib_lookup @ 2023-02-17 20:55 Martin KaFai Lau 2023-02-17 20:55 ` [PATCH v3 bpf-next 2/2] selftests/bpf: Add bpf_fib_lookup test Martin KaFai Lau 2023-02-17 21:20 ` [PATCH v3 bpf-next 1/2] bpf: Add BPF_FIB_LOOKUP_SKIP_NEIGH for bpf_fib_lookup patchwork-bot+netdevbpf 0 siblings, 2 replies; 3+ messages in thread From: Martin KaFai Lau @ 2023-02-17 20:55 UTC (permalink / raw) To: bpf Cc: 'Alexei Starovoitov ', 'Andrii Nakryiko ', 'Daniel Borkmann ', netdev, kernel-team From: Martin KaFai Lau <martin.lau@kernel.org> The bpf_fib_lookup() also looks up the neigh table. This was done before bpf_redirect_neigh() was added. In the use case that does not manage the neigh table and requires bpf_fib_lookup() to lookup a fib to decide if it needs to redirect or not, the bpf prog can depend only on using bpf_redirect_neigh() to lookup the neigh. It also keeps the neigh entries fresh and connected. This patch adds a bpf_fib_lookup flag, SKIP_NEIGH, to avoid the double neigh lookup when the bpf prog always call bpf_redirect_neigh() to do the neigh lookup. The params->smac output is skipped together when SKIP_NEIGH is set because bpf_redirect_neigh() will figure out the smac also. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> --- v3: - Add documentation for BPF_FIB_LOOKUP_SKIP_NEIGH v2: - Skip copying smac when the SKIP_NEIGH is set - Keep the ordering of the (nhc->nhc_gw_family != AF_INET6) test include/uapi/linux/bpf.h | 6 ++++++ net/core/filter.c | 39 ++++++++++++++++++++++------------ tools/include/uapi/linux/bpf.h | 6 ++++++ 3 files changed, 38 insertions(+), 13 deletions(-) diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 1503f61336b6..62ce1f5d1b1d 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3134,6 +3134,11 @@ union bpf_attr { * **BPF_FIB_LOOKUP_OUTPUT** * Perform lookup from an egress perspective (default is * ingress). + * **BPF_FIB_LOOKUP_SKIP_NEIGH** + * Skip the neighbour table lookup. *params*->dmac + * and *params*->smac will not be set as output. A common + * use case is to call **bpf_redirect_neigh**\ () after + * doing **bpf_fib_lookup**\ (). * * *ctx* is either **struct xdp_md** for XDP programs or * **struct sk_buff** tc cls_act programs. @@ -6750,6 +6755,7 @@ struct bpf_raw_tracepoint_args { enum { BPF_FIB_LOOKUP_DIRECT = (1U << 0), BPF_FIB_LOOKUP_OUTPUT = (1U << 1), + BPF_FIB_LOOKUP_SKIP_NEIGH = (1U << 2), }; enum { diff --git a/net/core/filter.c b/net/core/filter.c index 8daaaf76ab15..1d6f165923bf 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -5722,12 +5722,8 @@ static const struct bpf_func_proto bpf_skb_get_xfrm_state_proto = { #endif #if IS_ENABLED(CONFIG_INET) || IS_ENABLED(CONFIG_IPV6) -static int bpf_fib_set_fwd_params(struct bpf_fib_lookup *params, - const struct neighbour *neigh, - const struct net_device *dev, u32 mtu) +static int bpf_fib_set_fwd_params(struct bpf_fib_lookup *params, u32 mtu) { - memcpy(params->dmac, neigh->ha, ETH_ALEN); - memcpy(params->smac, dev->dev_addr, ETH_ALEN); params->h_vlan_TCI = 0; params->h_vlan_proto = 0; if (mtu) @@ -5838,21 +5834,29 @@ static int bpf_ipv4_fib_lookup(struct net *net, struct bpf_fib_lookup *params, if (likely(nhc->nhc_gw_family != AF_INET6)) { if (nhc->nhc_gw_family) params->ipv4_dst = nhc->nhc_gw.ipv4; - - neigh = __ipv4_neigh_lookup_noref(dev, - (__force u32)params->ipv4_dst); } else { struct in6_addr *dst = (struct in6_addr *)params->ipv6_dst; params->family = AF_INET6; *dst = nhc->nhc_gw.ipv6; - neigh = __ipv6_neigh_lookup_noref_stub(dev, dst); } + if (flags & BPF_FIB_LOOKUP_SKIP_NEIGH) + goto set_fwd_params; + + if (likely(nhc->nhc_gw_family != AF_INET6)) + neigh = __ipv4_neigh_lookup_noref(dev, + (__force u32)params->ipv4_dst); + else + neigh = __ipv6_neigh_lookup_noref_stub(dev, params->ipv6_dst); + if (!neigh || !(neigh->nud_state & NUD_VALID)) return BPF_FIB_LKUP_RET_NO_NEIGH; + memcpy(params->dmac, neigh->ha, ETH_ALEN); + memcpy(params->smac, dev->dev_addr, ETH_ALEN); - return bpf_fib_set_fwd_params(params, neigh, dev, mtu); +set_fwd_params: + return bpf_fib_set_fwd_params(params, mtu); } #endif @@ -5960,24 +5964,33 @@ static int bpf_ipv6_fib_lookup(struct net *net, struct bpf_fib_lookup *params, params->rt_metric = res.f6i->fib6_metric; params->ifindex = dev->ifindex; + if (flags & BPF_FIB_LOOKUP_SKIP_NEIGH) + goto set_fwd_params; + /* xdp and cls_bpf programs are run in RCU-bh so rcu_read_lock_bh is * not needed here. */ neigh = __ipv6_neigh_lookup_noref_stub(dev, dst); if (!neigh || !(neigh->nud_state & NUD_VALID)) return BPF_FIB_LKUP_RET_NO_NEIGH; + memcpy(params->dmac, neigh->ha, ETH_ALEN); + memcpy(params->smac, dev->dev_addr, ETH_ALEN); - return bpf_fib_set_fwd_params(params, neigh, dev, mtu); +set_fwd_params: + return bpf_fib_set_fwd_params(params, mtu); } #endif +#define BPF_FIB_LOOKUP_MASK (BPF_FIB_LOOKUP_DIRECT | BPF_FIB_LOOKUP_OUTPUT | \ + BPF_FIB_LOOKUP_SKIP_NEIGH) + BPF_CALL_4(bpf_xdp_fib_lookup, struct xdp_buff *, ctx, struct bpf_fib_lookup *, params, int, plen, u32, flags) { if (plen < sizeof(*params)) return -EINVAL; - if (flags & ~(BPF_FIB_LOOKUP_DIRECT | BPF_FIB_LOOKUP_OUTPUT)) + if (flags & ~BPF_FIB_LOOKUP_MASK) return -EINVAL; switch (params->family) { @@ -6015,7 +6028,7 @@ BPF_CALL_4(bpf_skb_fib_lookup, struct sk_buff *, skb, if (plen < sizeof(*params)) return -EINVAL; - if (flags & ~(BPF_FIB_LOOKUP_DIRECT | BPF_FIB_LOOKUP_OUTPUT)) + if (flags & ~BPF_FIB_LOOKUP_MASK) return -EINVAL; if (params->tot_len) diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 1503f61336b6..62ce1f5d1b1d 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -3134,6 +3134,11 @@ union bpf_attr { * **BPF_FIB_LOOKUP_OUTPUT** * Perform lookup from an egress perspective (default is * ingress). + * **BPF_FIB_LOOKUP_SKIP_NEIGH** + * Skip the neighbour table lookup. *params*->dmac + * and *params*->smac will not be set as output. A common + * use case is to call **bpf_redirect_neigh**\ () after + * doing **bpf_fib_lookup**\ (). * * *ctx* is either **struct xdp_md** for XDP programs or * **struct sk_buff** tc cls_act programs. @@ -6750,6 +6755,7 @@ struct bpf_raw_tracepoint_args { enum { BPF_FIB_LOOKUP_DIRECT = (1U << 0), BPF_FIB_LOOKUP_OUTPUT = (1U << 1), + BPF_FIB_LOOKUP_SKIP_NEIGH = (1U << 2), }; enum { -- 2.30.2 ^ permalink raw reply related [flat|nested] 3+ messages in thread
* [PATCH v3 bpf-next 2/2] selftests/bpf: Add bpf_fib_lookup test 2023-02-17 20:55 [PATCH v3 bpf-next 1/2] bpf: Add BPF_FIB_LOOKUP_SKIP_NEIGH for bpf_fib_lookup Martin KaFai Lau @ 2023-02-17 20:55 ` Martin KaFai Lau 2023-02-17 21:20 ` [PATCH v3 bpf-next 1/2] bpf: Add BPF_FIB_LOOKUP_SKIP_NEIGH for bpf_fib_lookup patchwork-bot+netdevbpf 1 sibling, 0 replies; 3+ messages in thread From: Martin KaFai Lau @ 2023-02-17 20:55 UTC (permalink / raw) To: bpf Cc: 'Alexei Starovoitov ', 'Andrii Nakryiko ', 'Daniel Borkmann ', netdev, kernel-team From: Martin KaFai Lau <martin.lau@kernel.org> This patch tests the bpf_fib_lookup helper when looking up a neigh in NUD_FAILED and NUD_STALE state. It also adds test for the new BPF_FIB_LOOKUP_SKIP_NEIGH flag. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> --- .../selftests/bpf/prog_tests/fib_lookup.c | 187 ++++++++++++++++++ .../testing/selftests/bpf/progs/fib_lookup.c | 22 +++ 2 files changed, 209 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/fib_lookup.c create mode 100644 tools/testing/selftests/bpf/progs/fib_lookup.c diff --git a/tools/testing/selftests/bpf/prog_tests/fib_lookup.c b/tools/testing/selftests/bpf/prog_tests/fib_lookup.c new file mode 100644 index 000000000000..61ccddccf485 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/fib_lookup.c @@ -0,0 +1,187 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */ + +#include <sys/types.h> +#include <net/if.h> + +#include "test_progs.h" +#include "network_helpers.h" +#include "fib_lookup.skel.h" + +#define SYS(fmt, ...) \ + ({ \ + char cmd[1024]; \ + snprintf(cmd, sizeof(cmd), fmt, ##__VA_ARGS__); \ + if (!ASSERT_OK(system(cmd), cmd)) \ + goto fail; \ + }) + +#define NS_TEST "fib_lookup_ns" +#define IPV6_IFACE_ADDR "face::face" +#define IPV6_NUD_FAILED_ADDR "face::1" +#define IPV6_NUD_STALE_ADDR "face::2" +#define IPV4_IFACE_ADDR "10.0.0.254" +#define IPV4_NUD_FAILED_ADDR "10.0.0.1" +#define IPV4_NUD_STALE_ADDR "10.0.0.2" +#define DMAC "11:11:11:11:11:11" +#define DMAC_INIT { 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, } + +struct fib_lookup_test { + const char *desc; + const char *daddr; + int expected_ret; + int lookup_flags; + __u8 dmac[6]; +}; + +static const struct fib_lookup_test tests[] = { + { .desc = "IPv6 failed neigh", + .daddr = IPV6_NUD_FAILED_ADDR, .expected_ret = BPF_FIB_LKUP_RET_NO_NEIGH, }, + { .desc = "IPv6 stale neigh", + .daddr = IPV6_NUD_STALE_ADDR, .expected_ret = BPF_FIB_LKUP_RET_SUCCESS, + .dmac = DMAC_INIT, }, + { .desc = "IPv6 skip neigh", + .daddr = IPV6_NUD_FAILED_ADDR, .expected_ret = BPF_FIB_LKUP_RET_SUCCESS, + .lookup_flags = BPF_FIB_LOOKUP_SKIP_NEIGH, }, + { .desc = "IPv4 failed neigh", + .daddr = IPV4_NUD_FAILED_ADDR, .expected_ret = BPF_FIB_LKUP_RET_NO_NEIGH, }, + { .desc = "IPv4 stale neigh", + .daddr = IPV4_NUD_STALE_ADDR, .expected_ret = BPF_FIB_LKUP_RET_SUCCESS, + .dmac = DMAC_INIT, }, + { .desc = "IPv4 skip neigh", + .daddr = IPV4_NUD_FAILED_ADDR, .expected_ret = BPF_FIB_LKUP_RET_SUCCESS, + .lookup_flags = BPF_FIB_LOOKUP_SKIP_NEIGH, }, +}; + +static int ifindex; + +static int setup_netns(void) +{ + int err; + + SYS("ip link add veth1 type veth peer name veth2"); + SYS("ip link set dev veth1 up"); + + SYS("ip addr add %s/64 dev veth1 nodad", IPV6_IFACE_ADDR); + SYS("ip neigh add %s dev veth1 nud failed", IPV6_NUD_FAILED_ADDR); + SYS("ip neigh add %s dev veth1 lladdr %s nud stale", IPV6_NUD_STALE_ADDR, DMAC); + + SYS("ip addr add %s/24 dev veth1 nodad", IPV4_IFACE_ADDR); + SYS("ip neigh add %s dev veth1 nud failed", IPV4_NUD_FAILED_ADDR); + SYS("ip neigh add %s dev veth1 lladdr %s nud stale", IPV4_NUD_STALE_ADDR, DMAC); + + err = write_sysctl("/proc/sys/net/ipv4/conf/veth1/forwarding", "1"); + if (!ASSERT_OK(err, "write_sysctl(net.ipv4.conf.veth1.forwarding)")) + goto fail; + + err = write_sysctl("/proc/sys/net/ipv6/conf/veth1/forwarding", "1"); + if (!ASSERT_OK(err, "write_sysctl(net.ipv6.conf.veth1.forwarding)")) + goto fail; + + return 0; +fail: + return -1; +} + +static int set_lookup_params(struct bpf_fib_lookup *params, const char *daddr) +{ + int ret; + + memset(params, 0, sizeof(*params)); + + params->l4_protocol = IPPROTO_TCP; + params->ifindex = ifindex; + + if (inet_pton(AF_INET6, daddr, params->ipv6_dst) == 1) { + params->family = AF_INET6; + ret = inet_pton(AF_INET6, IPV6_IFACE_ADDR, params->ipv6_src); + if (!ASSERT_EQ(ret, 1, "inet_pton(IPV6_IFACE_ADDR)")) + return -1; + return 0; + } + + ret = inet_pton(AF_INET, daddr, ¶ms->ipv4_dst); + if (!ASSERT_EQ(ret, 1, "convert IP[46] address")) + return -1; + params->family = AF_INET; + ret = inet_pton(AF_INET, IPV4_IFACE_ADDR, ¶ms->ipv4_src); + if (!ASSERT_EQ(ret, 1, "inet_pton(IPV4_IFACE_ADDR)")) + return -1; + + return 0; +} + +static void mac_str(char *b, const __u8 *mac) +{ + sprintf(b, "%02X:%02X:%02X:%02X:%02X:%02X", + mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]); +} + +void test_fib_lookup(void) +{ + struct bpf_fib_lookup *fib_params; + struct nstoken *nstoken = NULL; + struct __sk_buff skb = { }; + struct fib_lookup *skel; + int prog_fd, err, ret, i; + + /* The test does not use the skb->data, so + * use pkt_v6 for both v6 and v4 test. + */ + LIBBPF_OPTS(bpf_test_run_opts, run_opts, + .data_in = &pkt_v6, + .data_size_in = sizeof(pkt_v6), + .ctx_in = &skb, + .ctx_size_in = sizeof(skb), + ); + + skel = fib_lookup__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel open_and_load")) + return; + prog_fd = bpf_program__fd(skel->progs.fib_lookup); + + SYS("ip netns add %s", NS_TEST); + + nstoken = open_netns(NS_TEST); + if (!ASSERT_OK_PTR(nstoken, "open_netns")) + goto fail; + + if (setup_netns()) + goto fail; + + ifindex = if_nametoindex("veth1"); + skb.ifindex = ifindex; + fib_params = &skel->bss->fib_params; + + for (i = 0; i < ARRAY_SIZE(tests); i++) { + printf("Testing %s\n", tests[i].desc); + + if (set_lookup_params(fib_params, tests[i].daddr)) + continue; + skel->bss->fib_lookup_ret = -1; + skel->bss->lookup_flags = BPF_FIB_LOOKUP_OUTPUT | + tests[i].lookup_flags; + + err = bpf_prog_test_run_opts(prog_fd, &run_opts); + if (!ASSERT_OK(err, "bpf_prog_test_run_opts")) + continue; + + ASSERT_EQ(tests[i].expected_ret, skel->bss->fib_lookup_ret, + "fib_lookup_ret"); + + ret = memcmp(tests[i].dmac, fib_params->dmac, sizeof(tests[i].dmac)); + if (!ASSERT_EQ(ret, 0, "dmac not match")) { + char expected[18], actual[18]; + + mac_str(expected, tests[i].dmac); + mac_str(actual, fib_params->dmac); + printf("dmac expected %s actual %s\n", expected, actual); + } + } + +fail: + if (nstoken) + close_netns(nstoken); + system("ip netns del " NS_TEST " &> /dev/null"); + fib_lookup__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/progs/fib_lookup.c b/tools/testing/selftests/bpf/progs/fib_lookup.c new file mode 100644 index 000000000000..c4514dd58c62 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/fib_lookup.c @@ -0,0 +1,22 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */ + +#include <linux/types.h> +#include <linux/bpf.h> +#include <bpf/bpf_helpers.h> +#include "bpf_tracing_net.h" + +struct bpf_fib_lookup fib_params = {}; +int fib_lookup_ret = 0; +int lookup_flags = 0; + +SEC("tc") +int fib_lookup(struct __sk_buff *skb) +{ + fib_lookup_ret = bpf_fib_lookup(skb, &fib_params, sizeof(fib_params), + lookup_flags); + + return TC_ACT_SHOT; +} + +char _license[] SEC("license") = "GPL"; -- 2.30.2 ^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH v3 bpf-next 1/2] bpf: Add BPF_FIB_LOOKUP_SKIP_NEIGH for bpf_fib_lookup 2023-02-17 20:55 [PATCH v3 bpf-next 1/2] bpf: Add BPF_FIB_LOOKUP_SKIP_NEIGH for bpf_fib_lookup Martin KaFai Lau 2023-02-17 20:55 ` [PATCH v3 bpf-next 2/2] selftests/bpf: Add bpf_fib_lookup test Martin KaFai Lau @ 2023-02-17 21:20 ` patchwork-bot+netdevbpf 1 sibling, 0 replies; 3+ messages in thread From: patchwork-bot+netdevbpf @ 2023-02-17 21:20 UTC (permalink / raw) To: Martin KaFai Lau; +Cc: bpf, ast, andrii, daniel, netdev, kernel-team Hello: This series was applied to bpf/bpf-next.git (master) by Daniel Borkmann <daniel@iogearbox.net>: On Fri, 17 Feb 2023 12:55:14 -0800 you wrote: > From: Martin KaFai Lau <martin.lau@kernel.org> > > The bpf_fib_lookup() also looks up the neigh table. > This was done before bpf_redirect_neigh() was added. > > In the use case that does not manage the neigh table > and requires bpf_fib_lookup() to lookup a fib to > decide if it needs to redirect or not, the bpf prog can > depend only on using bpf_redirect_neigh() to lookup the > neigh. It also keeps the neigh entries fresh and connected. > > [...] Here is the summary with links: - [v3,bpf-next,1/2] bpf: Add BPF_FIB_LOOKUP_SKIP_NEIGH for bpf_fib_lookup https://git.kernel.org/bpf/bpf-next/c/31de4105f00d - [v3,bpf-next,2/2] selftests/bpf: Add bpf_fib_lookup test https://git.kernel.org/bpf/bpf-next/c/168de0233586 You are awesome, thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html ^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2023-02-17 21:20 UTC | newest] Thread overview: 3+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-02-17 20:55 [PATCH v3 bpf-next 1/2] bpf: Add BPF_FIB_LOOKUP_SKIP_NEIGH for bpf_fib_lookup Martin KaFai Lau 2023-02-17 20:55 ` [PATCH v3 bpf-next 2/2] selftests/bpf: Add bpf_fib_lookup test Martin KaFai Lau 2023-02-17 21:20 ` [PATCH v3 bpf-next 1/2] bpf: Add BPF_FIB_LOOKUP_SKIP_NEIGH for bpf_fib_lookup patchwork-bot+netdevbpf
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox