[PATCH RFC bpf-next 0/2] bpf: Rework bpf_redirect_neigh() to allow supplying nexthop from caller

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH RFC bpf-next 0/2] bpf: Rework bpf_redirect_neigh() to allow supplying nexthop from caller
@ 2020-10-15 15:46 Toke Høiland-Jørgensen
  2020-10-15 15:46 ` [PATCH RFC bpf-next 1/2] bpf_redirect_neigh: Support supplying the nexthop as a helper parameter Toke Høiland-Jørgensen
  2020-10-15 15:46 ` [PATCH RFC bpf-next 2/2] selftests: Update test_tc_neigh to use the modified bpf_redirect_neigh() Toke Høiland-Jørgensen
  0 siblings, 2 replies; 15+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-10-15 15:46 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: David Ahern, netdev, bpf

Based on previous discussion[0], we determined that it would be beneficial to
rework bpf_redirect_neigh() so the caller can supply the nexthop information
(e.g., from a previous call to bpf_fib_lookup()). This way, the two helpers can
be combined without incurring a second FIB lookup to find the nexthop, and
bpf_fib_lookup() becomes usable even if no nexthop entry currently exists.

This patch (and accompanying selftest update) accomplishes this by way of an
optional paramter to bpf_redirect_neigh(). This is an API change, and so should
really be merged into the bpf tree to be part of the 5.10 cycle; however, since
bpf-next has not yet been merged into bpf, I'm sending this as an RFC against
bpf-next for discussion, and will repost against bpf once that merge happens
(Daniel, unless you have a better way of doing this, of course).

-Toke

[0] https://lore.kernel.org/bpf/393e17fc-d187-3a8d-2f0d-a627c7c63fca@iogearbox.net/

---

Toke Høiland-Jørgensen (2):
      bpf_redirect_neigh: Support supplying the nexthop as a helper parameter
      selftests: Update test_tc_neigh to use the modified bpf_redirect_neigh()

 .../selftests/bpf/progs/test_tc_neigh.c       | 83 ++++++++++++++++---
 .../testing/selftests/bpf/test_tc_redirect.sh |  8 +-
 2 files changed, 78 insertions(+), 13 deletions(-)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH RFC bpf-next 1/2] bpf_redirect_neigh: Support supplying the nexthop as a helper parameter
  2020-10-15 15:46 [PATCH RFC bpf-next 0/2] bpf: Rework bpf_redirect_neigh() to allow supplying nexthop from caller Toke Høiland-Jørgensen
@ 2020-10-15 15:46 ` Toke Høiland-Jørgensen
  2020-10-15 16:27   ` David Ahern
                     ` (5 more replies)
  2020-10-15 15:46 ` [PATCH RFC bpf-next 2/2] selftests: Update test_tc_neigh to use the modified bpf_redirect_neigh() Toke Høiland-Jørgensen
  1 sibling, 6 replies; 15+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-10-15 15:46 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: David Ahern, netdev, bpf

From: Toke Høiland-Jørgensen <toke@redhat.com>

Based on the discussion in [0], update the bpf_redirect_neigh() helper to
accept an optional parameter specifying the nexthop information. This makes
it possible to combine bpf_fib_lookup() and bpf_redirect_neigh() without
incurring a duplicate FIB lookup - since the FIB lookup helper will return
the nexthop information even if no neighbour is present, this can simply be
passed on to bpf_redirect_neigh() if bpf_fib_lookup() returns
BPF_FIB_LKUP_RET_NO_NEIGH.

[0] https://lore.kernel.org/bpf/393e17fc-d187-3a8d-2f0d-a627c7c63fca@iogearbox.net/

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
 include/linux/filter.h         |    9 ++
 include/uapi/linux/bpf.h       |   23 +++++-
 net/core/filter.c              |  152 +++++++++++++++++++++++++---------------
 scripts/bpf_helpers_doc.py     |    1 
 tools/include/uapi/linux/bpf.h |   23 +++++-
 5 files changed, 143 insertions(+), 65 deletions(-)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index 20fc24c9779a..ba9de7188cd0 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -607,12 +607,21 @@ struct bpf_skb_data_end {
 	void *data_end;
 };
 
+struct bpf_nh_params {
+	u8 nh_family;
+	union {
+		__u32 ipv4_nh;
+		struct in6_addr ipv6_nh;
+	};
+};
+
 struct bpf_redirect_info {
 	u32 flags;
 	u32 tgt_index;
 	void *tgt_value;
 	struct bpf_map *map;
 	u32 kern_flags;
+	struct bpf_nh_params nh;
 };
 
 DECLARE_PER_CPU(struct bpf_redirect_info, bpf_redirect_info);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index bf5a99d803e4..980cc1363be8 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -3677,15 +3677,19 @@ union bpf_attr {
  * 	Return
  * 		The id is returned or 0 in case the id could not be retrieved.
  *
- * long bpf_redirect_neigh(u32 ifindex, u64 flags)
+ * long bpf_redirect_neigh(u32 ifindex, struct bpf_redir_neigh *params, int plen, u64 flags)
  * 	Description
  * 		Redirect the packet to another net device of index *ifindex*
  * 		and fill in L2 addresses from neighboring subsystem. This helper
  * 		is somewhat similar to **bpf_redirect**\ (), except that it
  * 		populates L2 addresses as well, meaning, internally, the helper
- * 		performs a FIB lookup based on the skb's networking header to
- * 		get the address of the next hop and then relies on the neighbor
- * 		lookup for the L2 address of the nexthop.
+ * 		relies on the neighbor lookup for the L2 address of the nexthop.
+ *
+ * 		The helper will perform a FIB lookup based on the skb's
+ * 		networking header to get the address of the next hop, unless
+ * 		this is supplied by the caller in the *params* argument. The
+ * 		*plen* argument indicates the len of *params* and should be set
+ * 		to 0 if *params* is NULL.
  *
  * 		The *flags* argument is reserved and must be 0. The helper is
  * 		currently only supported for tc BPF program types, and enabled
@@ -4906,6 +4910,17 @@ struct bpf_fib_lookup {
 	__u8	dmac[6];     /* ETH_ALEN */
 };
 
+struct bpf_redir_neigh {
+	/* network family for lookup (AF_INET, AF_INET6)
+	 */
+	__u8	nh_family;
+	/* network address of nexthop; skips fib lookup to find gateway */
+	union {
+		__be32		ipv4_nh;
+		__u32		ipv6_nh[4];  /* in6_addr; network order */
+	};
+};
+
 enum bpf_task_fd_type {
 	BPF_FD_TYPE_RAW_TRACEPOINT,	/* tp name */
 	BPF_FD_TYPE_TRACEPOINT,		/* tp name */
diff --git a/net/core/filter.c b/net/core/filter.c
index c5e2a1c5fd8d..d073031a3a61 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2165,12 +2165,11 @@ static int __bpf_redirect(struct sk_buff *skb, struct net_device *dev,
 }
 
 #if IS_ENABLED(CONFIG_IPV6)
-static int bpf_out_neigh_v6(struct net *net, struct sk_buff *skb)
+static int bpf_out_neigh_v6(struct net *net, struct sk_buff *skb,
+			    struct net_device *dev, const struct in6_addr *nexthop)
 {
-	struct dst_entry *dst = skb_dst(skb);
-	struct net_device *dev = dst->dev;
 	u32 hh_len = LL_RESERVED_SPACE(dev);
-	const struct in6_addr *nexthop;
+	struct dst_entry *dst = NULL;
 	struct neighbour *neigh;
 
 	if (dev_xmit_recursion()) {
@@ -2196,8 +2195,11 @@ static int bpf_out_neigh_v6(struct net *net, struct sk_buff *skb)
 	}
 
 	rcu_read_lock_bh();
-	nexthop = rt6_nexthop(container_of(dst, struct rt6_info, dst),
-			      &ipv6_hdr(skb)->daddr);
+	if (!nexthop) {
+		dst = skb_dst(skb);
+		nexthop = rt6_nexthop(container_of(dst, struct rt6_info, dst),
+				      &ipv6_hdr(skb)->daddr);
+	}
 	neigh = ip_neigh_gw6(dev, nexthop);
 	if (likely(!IS_ERR(neigh))) {
 		int ret;
@@ -2210,36 +2212,46 @@ static int bpf_out_neigh_v6(struct net *net, struct sk_buff *skb)
 		return ret;
 	}
 	rcu_read_unlock_bh();
-	IP6_INC_STATS(dev_net(dst->dev),
-		      ip6_dst_idev(dst), IPSTATS_MIB_OUTNOROUTES);
+	if (dst)
+		IP6_INC_STATS(dev_net(dst->dev),
+			      ip6_dst_idev(dst), IPSTATS_MIB_OUTNOROUTES);
 out_drop:
 	kfree_skb(skb);
 	return -ENETDOWN;
 }
 
-static int __bpf_redirect_neigh_v6(struct sk_buff *skb, struct net_device *dev)
+static int __bpf_redirect_neigh_v6(struct sk_buff *skb, struct net_device *dev,
+				   struct bpf_nh_params *nh)
 {
 	const struct ipv6hdr *ip6h = ipv6_hdr(skb);
+	struct in6_addr *nexthop = NULL;
 	struct net *net = dev_net(dev);
 	int err, ret = NET_XMIT_DROP;
-	struct dst_entry *dst;
-	struct flowi6 fl6 = {
-		.flowi6_flags	= FLOWI_FLAG_ANYSRC,
-		.flowi6_mark	= skb->mark,
-		.flowlabel	= ip6_flowinfo(ip6h),
-		.flowi6_oif	= dev->ifindex,
-		.flowi6_proto	= ip6h->nexthdr,
-		.daddr		= ip6h->daddr,
-		.saddr		= ip6h->saddr,
-	};
 
-	dst = ipv6_stub->ipv6_dst_lookup_flow(net, NULL, &fl6, NULL);
-	if (IS_ERR(dst))
-		goto out_drop;
+	if (!nh->nh_family) {
+		struct dst_entry *dst;
+		struct flowi6 fl6 = {
+			.flowi6_flags = FLOWI_FLAG_ANYSRC,
+			.flowi6_mark = skb->mark,
+			.flowlabel = ip6_flowinfo(ip6h),
+			.flowi6_oif = dev->ifindex,
+			.flowi6_proto = ip6h->nexthdr,
+			.daddr = ip6h->daddr,
+			.saddr = ip6h->saddr,
+		};
+
+		dst = ipv6_stub->ipv6_dst_lookup_flow(net, NULL, &fl6, NULL);
+		if (IS_ERR(dst))
+			goto out_drop;
 
-	skb_dst_set(skb, dst);
+		skb_dst_set(skb, dst);
+	} else if (nh->nh_family == AF_INET6) {
+		nexthop = &nh->ipv6_nh;
+	} else {
+		goto out_drop;
+	}
 
-	err = bpf_out_neigh_v6(net, skb);
+	err = bpf_out_neigh_v6(net, skb, dev, nexthop);
 	if (unlikely(net_xmit_eval(err)))
 		dev->stats.tx_errors++;
 	else
@@ -2260,11 +2272,9 @@ static int __bpf_redirect_neigh_v6(struct sk_buff *skb, struct net_device *dev)
 #endif /* CONFIG_IPV6 */
 
 #if IS_ENABLED(CONFIG_INET)
-static int bpf_out_neigh_v4(struct net *net, struct sk_buff *skb)
+static int bpf_out_neigh_v4(struct net *net, struct sk_buff *skb,
+			    struct net_device *dev, struct bpf_nh_params *nh)
 {
-	struct dst_entry *dst = skb_dst(skb);
-	struct rtable *rt = container_of(dst, struct rtable, dst);
-	struct net_device *dev = dst->dev;
 	u32 hh_len = LL_RESERVED_SPACE(dev);
 	struct neighbour *neigh;
 	bool is_v6gw = false;
@@ -2292,7 +2302,20 @@ static int bpf_out_neigh_v4(struct net *net, struct sk_buff *skb)
 	}
 
 	rcu_read_lock_bh();
-	neigh = ip_neigh_for_gw(rt, skb, &is_v6gw);
+	if (!nh) {
+		struct dst_entry *dst = skb_dst(skb);
+		struct rtable *rt = container_of(dst, struct rtable, dst);
+
+		neigh = ip_neigh_for_gw(rt, skb, &is_v6gw);
+	} else if (nh->nh_family == AF_INET6) {
+		neigh = ip_neigh_gw6(dev, &nh->ipv6_nh);
+		is_v6gw = true;
+	} else if (nh->nh_family == AF_INET) {
+		neigh = ip_neigh_gw4(dev, nh->ipv4_nh);
+	} else {
+		goto out_drop;
+	}
+
 	if (likely(!IS_ERR(neigh))) {
 		int ret;
 
@@ -2309,33 +2332,38 @@ static int bpf_out_neigh_v4(struct net *net, struct sk_buff *skb)
 	return -ENETDOWN;
 }
 
-static int __bpf_redirect_neigh_v4(struct sk_buff *skb, struct net_device *dev)
+static int __bpf_redirect_neigh_v4(struct sk_buff *skb, struct net_device *dev,
+				   struct bpf_nh_params *nh)
 {
 	const struct iphdr *ip4h = ip_hdr(skb);
 	struct net *net = dev_net(dev);
 	int err, ret = NET_XMIT_DROP;
-	struct rtable *rt;
-	struct flowi4 fl4 = {
-		.flowi4_flags	= FLOWI_FLAG_ANYSRC,
-		.flowi4_mark	= skb->mark,
-		.flowi4_tos	= RT_TOS(ip4h->tos),
-		.flowi4_oif	= dev->ifindex,
-		.flowi4_proto	= ip4h->protocol,
-		.daddr		= ip4h->daddr,
-		.saddr		= ip4h->saddr,
-	};
 
-	rt = ip_route_output_flow(net, &fl4, NULL);
-	if (IS_ERR(rt))
-		goto out_drop;
-	if (rt->rt_type != RTN_UNICAST && rt->rt_type != RTN_LOCAL) {
-		ip_rt_put(rt);
-		goto out_drop;
-	}
+	if (!nh->nh_family) {
+		struct rtable *rt;
+		struct flowi4 fl4 = {
+			.flowi4_flags = FLOWI_FLAG_ANYSRC,
+			.flowi4_mark = skb->mark,
+			.flowi4_tos = RT_TOS(ip4h->tos),
+			.flowi4_oif = dev->ifindex,
+			.flowi4_proto = ip4h->protocol,
+			.daddr = ip4h->daddr,
+			.saddr = ip4h->saddr,
+		};
+
+		rt = ip_route_output_flow(net, &fl4, NULL);
+		if (IS_ERR(rt))
+			goto out_drop;
+		if (rt->rt_type != RTN_UNICAST && rt->rt_type != RTN_LOCAL) {
+			ip_rt_put(rt);
+			goto out_drop;
+		}
 
-	skb_dst_set(skb, &rt->dst);
+		skb_dst_set(skb, &rt->dst);
+		nh = NULL;
+	}
 
-	err = bpf_out_neigh_v4(net, skb);
+	err = bpf_out_neigh_v4(net, skb, dev, nh);
 	if (unlikely(net_xmit_eval(err)))
 		dev->stats.tx_errors++;
 	else
@@ -2355,7 +2383,8 @@ static int __bpf_redirect_neigh_v4(struct sk_buff *skb, struct net_device *dev)
 }
 #endif /* CONFIG_INET */
 
-static int __bpf_redirect_neigh(struct sk_buff *skb, struct net_device *dev)
+static int __bpf_redirect_neigh(struct sk_buff *skb, struct net_device *dev,
+				struct bpf_nh_params *nh)
 {
 	struct ethhdr *ethh = eth_hdr(skb);
 
@@ -2370,9 +2399,9 @@ static int __bpf_redirect_neigh(struct sk_buff *skb, struct net_device *dev)
 	skb_reset_network_header(skb);
 
 	if (skb->protocol == htons(ETH_P_IP))
-		return __bpf_redirect_neigh_v4(skb, dev);
+		return __bpf_redirect_neigh_v4(skb, dev, nh);
 	else if (skb->protocol == htons(ETH_P_IPV6))
-		return __bpf_redirect_neigh_v6(skb, dev);
+		return __bpf_redirect_neigh_v6(skb, dev, nh);
 out:
 	kfree_skb(skb);
 	return -ENOTSUPP;
@@ -2455,8 +2484,8 @@ int skb_do_redirect(struct sk_buff *skb)
 		return -EAGAIN;
 	}
 	return flags & BPF_F_NEIGH ?
-	       __bpf_redirect_neigh(skb, dev) :
-	       __bpf_redirect(skb, dev, flags);
+		__bpf_redirect_neigh(skb, dev, &ri->nh) :
+		__bpf_redirect(skb, dev, flags);
 out_drop:
 	kfree_skb(skb);
 	return -EINVAL;
@@ -2504,16 +2533,23 @@ static const struct bpf_func_proto bpf_redirect_peer_proto = {
 	.arg2_type      = ARG_ANYTHING,
 };
 
-BPF_CALL_2(bpf_redirect_neigh, u32, ifindex, u64, flags)
+BPF_CALL_4(bpf_redirect_neigh, u32, ifindex, struct bpf_redir_neigh *, params,
+	   int, plen, u64, flags)
 {
 	struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);
 
-	if (unlikely(flags))
+	if (unlikely((plen && plen < sizeof(*params)) || flags))
 		return TC_ACT_SHOT;
 
 	ri->flags = BPF_F_NEIGH;
 	ri->tgt_index = ifindex;
 
+	BUILD_BUG_ON(sizeof(struct bpf_redir_neigh) != sizeof(struct bpf_nh_params));
+	if (plen)
+		memcpy(&ri->nh, params, sizeof(ri->nh));
+	else
+		ri->nh.nh_family = 0; /* clear previous value */
+
 	return TC_ACT_REDIRECT;
 }
 
@@ -2522,7 +2558,9 @@ static const struct bpf_func_proto bpf_redirect_neigh_proto = {
 	.gpl_only	= false,
 	.ret_type	= RET_INTEGER,
 	.arg1_type	= ARG_ANYTHING,
-	.arg2_type	= ARG_ANYTHING,
+	.arg2_type      = ARG_PTR_TO_MEM_OR_NULL,
+	.arg3_type      = ARG_CONST_SIZE_OR_ZERO,
+	.arg4_type	= ARG_ANYTHING,
 };
 
 BPF_CALL_2(bpf_msg_apply_bytes, struct sk_msg *, msg, u32, bytes)
diff --git a/scripts/bpf_helpers_doc.py b/scripts/bpf_helpers_doc.py
index 7d86fdd190be..6769caae142f 100755
--- a/scripts/bpf_helpers_doc.py
+++ b/scripts/bpf_helpers_doc.py
@@ -453,6 +453,7 @@ class PrinterHelpers(Printer):
             'struct bpf_perf_event_data',
             'struct bpf_perf_event_value',
             'struct bpf_pidns_info',
+            'struct bpf_redir_neigh',
             'struct bpf_sk_lookup',
             'struct bpf_sock',
             'struct bpf_sock_addr',
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index bf5a99d803e4..980cc1363be8 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -3677,15 +3677,19 @@ union bpf_attr {
  * 	Return
  * 		The id is returned or 0 in case the id could not be retrieved.
  *
- * long bpf_redirect_neigh(u32 ifindex, u64 flags)
+ * long bpf_redirect_neigh(u32 ifindex, struct bpf_redir_neigh *params, int plen, u64 flags)
  * 	Description
  * 		Redirect the packet to another net device of index *ifindex*
  * 		and fill in L2 addresses from neighboring subsystem. This helper
  * 		is somewhat similar to **bpf_redirect**\ (), except that it
  * 		populates L2 addresses as well, meaning, internally, the helper
- * 		performs a FIB lookup based on the skb's networking header to
- * 		get the address of the next hop and then relies on the neighbor
- * 		lookup for the L2 address of the nexthop.
+ * 		relies on the neighbor lookup for the L2 address of the nexthop.
+ *
+ * 		The helper will perform a FIB lookup based on the skb's
+ * 		networking header to get the address of the next hop, unless
+ * 		this is supplied by the caller in the *params* argument. The
+ * 		*plen* argument indicates the len of *params* and should be set
+ * 		to 0 if *params* is NULL.
  *
  * 		The *flags* argument is reserved and must be 0. The helper is
  * 		currently only supported for tc BPF program types, and enabled
@@ -4906,6 +4910,17 @@ struct bpf_fib_lookup {
 	__u8	dmac[6];     /* ETH_ALEN */
 };
 
+struct bpf_redir_neigh {
+	/* network family for lookup (AF_INET, AF_INET6)
+	 */
+	__u8	nh_family;
+	/* network address of nexthop; skips fib lookup to find gateway */
+	union {
+		__be32		ipv4_nh;
+		__u32		ipv6_nh[4];  /* in6_addr; network order */
+	};
+};
+
 enum bpf_task_fd_type {
 	BPF_FD_TYPE_RAW_TRACEPOINT,	/* tp name */
 	BPF_FD_TYPE_TRACEPOINT,		/* tp name */


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH RFC bpf-next 1/2] bpf_redirect_neigh: Support supplying the nexthop as a helper parameter
  2020-10-15 15:46 ` [PATCH RFC bpf-next 1/2] bpf_redirect_neigh: Support supplying the nexthop as a helper parameter Toke Høiland-Jørgensen
@ 2020-10-15 16:27   ` David Ahern
  2020-10-15 19:34     ` Toke Høiland-Jørgensen
  2020-10-15 18:17   ` kernel test robot
                     ` (4 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: David Ahern @ 2020-10-15 16:27 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Daniel Borkmann
  Cc: David Ahern, netdev, bpf

On 10/15/20 9:46 AM, Toke Høiland-Jørgensen wrote:
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index bf5a99d803e4..980cc1363be8 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -3677,15 +3677,19 @@ union bpf_attr {
>   * 	Return
>   * 		The id is returned or 0 in case the id could not be retrieved.
>   *
> - * long bpf_redirect_neigh(u32 ifindex, u64 flags)
> + * long bpf_redirect_neigh(u32 ifindex, struct bpf_redir_neigh *params, int plen, u64 flags)

why not fold ifindex into params? with params and plen this should be
extensible later if needed.

A couple of nits below that caught me eye.


>   * 	Description
>   * 		Redirect the packet to another net device of index *ifindex*
>   * 		and fill in L2 addresses from neighboring subsystem. This helper
>   * 		is somewhat similar to **bpf_redirect**\ (), except that it
>   * 		populates L2 addresses as well, meaning, internally, the helper
> - * 		performs a FIB lookup based on the skb's networking header to
> - * 		get the address of the next hop and then relies on the neighbor
> - * 		lookup for the L2 address of the nexthop.
> + * 		relies on the neighbor lookup for the L2 address of the nexthop.
> + *
> + * 		The helper will perform a FIB lookup based on the skb's
> + * 		networking header to get the address of the next hop, unless
> + * 		this is supplied by the caller in the *params* argument. The
> + * 		*plen* argument indicates the len of *params* and should be set
> + * 		to 0 if *params* is NULL.
>   *
>   * 		The *flags* argument is reserved and must be 0. The helper is
>   * 		currently only supported for tc BPF program types, and enabled
> @@ -4906,6 +4910,17 @@ struct bpf_fib_lookup {
>  	__u8	dmac[6];     /* ETH_ALEN */
>  };
>  
> +struct bpf_redir_neigh {
> +	/* network family for lookup (AF_INET, AF_INET6)
> +	 */

second line for the comment is not needed.

> +	__u8	nh_family;
> +	/* network address of nexthop; skips fib lookup to find gateway */
> +	union {
> +		__be32		ipv4_nh;
> +		__u32		ipv6_nh[4];  /* in6_addr; network order */
> +	};
> +};
> +
>  enum bpf_task_fd_type {
>  	BPF_FD_TYPE_RAW_TRACEPOINT,	/* tp name */
>  	BPF_FD_TYPE_TRACEPOINT,		/* tp name */
> diff --git a/net/core/filter.c b/net/core/filter.c
> index c5e2a1c5fd8d..d073031a3a61 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -2165,12 +2165,11 @@ static int __bpf_redirect(struct sk_buff *skb, struct net_device *dev,
>  }
>  
>  #if IS_ENABLED(CONFIG_IPV6)
> -static int bpf_out_neigh_v6(struct net *net, struct sk_buff *skb)
> +static int bpf_out_neigh_v6(struct net *net, struct sk_buff *skb,
> +			    struct net_device *dev, const struct in6_addr *nexthop)
>  {
> -	struct dst_entry *dst = skb_dst(skb);
> -	struct net_device *dev = dst->dev;
>  	u32 hh_len = LL_RESERVED_SPACE(dev);
> -	const struct in6_addr *nexthop;
> +	struct dst_entry *dst = NULL;
>  	struct neighbour *neigh;
>  
>  	if (dev_xmit_recursion()) {
> @@ -2196,8 +2195,11 @@ static int bpf_out_neigh_v6(struct net *net, struct sk_buff *skb)
>  	}
>  
>  	rcu_read_lock_bh();
> -	nexthop = rt6_nexthop(container_of(dst, struct rt6_info, dst),
> -			      &ipv6_hdr(skb)->daddr);
> +	if (!nexthop) {
> +		dst = skb_dst(skb);
> +		nexthop = rt6_nexthop(container_of(dst, struct rt6_info, dst),
> +				      &ipv6_hdr(skb)->daddr);
> +	}
>  	neigh = ip_neigh_gw6(dev, nexthop);
>  	if (likely(!IS_ERR(neigh))) {
>  		int ret;
> @@ -2210,36 +2212,46 @@ static int bpf_out_neigh_v6(struct net *net, struct sk_buff *skb)
>  		return ret;
>  	}
>  	rcu_read_unlock_bh();
> -	IP6_INC_STATS(dev_net(dst->dev),
> -		      ip6_dst_idev(dst), IPSTATS_MIB_OUTNOROUTES);
> +	if (dst)
> +		IP6_INC_STATS(dev_net(dst->dev),
> +			      ip6_dst_idev(dst), IPSTATS_MIB_OUTNOROUTES);
>  out_drop:
>  	kfree_skb(skb);
>  	return -ENETDOWN;
>  }
>  
> -static int __bpf_redirect_neigh_v6(struct sk_buff *skb, struct net_device *dev)
> +static int __bpf_redirect_neigh_v6(struct sk_buff *skb, struct net_device *dev,
> +				   struct bpf_nh_params *nh)
>  {
>  	const struct ipv6hdr *ip6h = ipv6_hdr(skb);
> +	struct in6_addr *nexthop = NULL;
>  	struct net *net = dev_net(dev);
>  	int err, ret = NET_XMIT_DROP;
> -	struct dst_entry *dst;
> -	struct flowi6 fl6 = {
> -		.flowi6_flags	= FLOWI_FLAG_ANYSRC,
> -		.flowi6_mark	= skb->mark,
> -		.flowlabel	= ip6_flowinfo(ip6h),
> -		.flowi6_oif	= dev->ifindex,
> -		.flowi6_proto	= ip6h->nexthdr,
> -		.daddr		= ip6h->daddr,
> -		.saddr		= ip6h->saddr,
> -	};
>  
> -	dst = ipv6_stub->ipv6_dst_lookup_flow(net, NULL, &fl6, NULL);
> -	if (IS_ERR(dst))
> -		goto out_drop;
> +	if (!nh->nh_family) {
> +		struct dst_entry *dst;

reverse xmas tree ordering

> +		struct flowi6 fl6 = {
> +			.flowi6_flags = FLOWI_FLAG_ANYSRC,
> +			.flowi6_mark = skb->mark,
> +			.flowlabel = ip6_flowinfo(ip6h),
> +			.flowi6_oif = dev->ifindex,
> +			.flowi6_proto = ip6h->nexthdr,
> +			.daddr = ip6h->daddr,
> +			.saddr = ip6h->saddr,
> +		};
> +
> +		dst = ipv6_stub->ipv6_dst_lookup_flow(net, NULL, &fl6, NULL);
> +		if (IS_ERR(dst))
> +			goto out_drop;
>  
> -	skb_dst_set(skb, dst);
> +		skb_dst_set(skb, dst);
> +	} else if (nh->nh_family == AF_INET6) {
> +		nexthop = &nh->ipv6_nh;
> +	} else {
> +		goto out_drop;
> +	}
>  
> -	err = bpf_out_neigh_v6(net, skb);
> +	err = bpf_out_neigh_v6(net, skb, dev, nexthop);
>  	if (unlikely(net_xmit_eval(err)))
>  		dev->stats.tx_errors++;
>  	else



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RFC bpf-next 1/2] bpf_redirect_neigh: Support supplying the nexthop as a helper parameter
  2020-10-15 16:27   ` David Ahern
@ 2020-10-15 19:34     ` Toke Høiland-Jørgensen
  2020-10-19 13:09       ` Daniel Borkmann
  0 siblings, 1 reply; 15+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-10-15 19:34 UTC (permalink / raw)
  To: David Ahern, Daniel Borkmann; +Cc: David Ahern, netdev, bpf

David Ahern <dsahern@gmail.com> writes:

> On 10/15/20 9:46 AM, Toke Høiland-Jørgensen wrote:
>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> index bf5a99d803e4..980cc1363be8 100644
>> --- a/include/uapi/linux/bpf.h
>> +++ b/include/uapi/linux/bpf.h
>> @@ -3677,15 +3677,19 @@ union bpf_attr {
>>   * 	Return
>>   * 		The id is returned or 0 in case the id could not be retrieved.
>>   *
>> - * long bpf_redirect_neigh(u32 ifindex, u64 flags)
>> + * long bpf_redirect_neigh(u32 ifindex, struct bpf_redir_neigh *params, int plen, u64 flags)
>
> why not fold ifindex into params? with params and plen this should be
> extensible later if needed.

Figured this way would make it easier to run *without* the params (like
in the existing examples). But don't feel strongly about it, let's see
what Daniel thinks.

> A couple of nits below that caught me eye.

Thanks, will fix; the kernel bot also found a sparse warning, so I guess
I need to respin anyway (but waiting for Daniel's comments and/or
instructions on what tree to properly submit this to).

-Toke


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RFC bpf-next 1/2] bpf_redirect_neigh: Support supplying the nexthop as a helper parameter
  2020-10-15 19:34     ` Toke Høiland-Jørgensen
@ 2020-10-19 13:09       ` Daniel Borkmann
  2020-10-19 13:28         ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 15+ messages in thread
From: Daniel Borkmann @ 2020-10-19 13:09 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, David Ahern; +Cc: David Ahern, netdev, bpf

On 10/15/20 9:34 PM, Toke Høiland-Jørgensen wrote:
> David Ahern <dsahern@gmail.com> writes:
>> On 10/15/20 9:46 AM, Toke Høiland-Jørgensen wrote:
>>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>>> index bf5a99d803e4..980cc1363be8 100644
>>> --- a/include/uapi/linux/bpf.h
>>> +++ b/include/uapi/linux/bpf.h
>>> @@ -3677,15 +3677,19 @@ union bpf_attr {
>>>    * 	Return
>>>    * 		The id is returned or 0 in case the id could not be retrieved.
>>>    *
>>> - * long bpf_redirect_neigh(u32 ifindex, u64 flags)
>>> + * long bpf_redirect_neigh(u32 ifindex, struct bpf_redir_neigh *params, int plen, u64 flags)
>>
>> why not fold ifindex into params? with params and plen this should be
>> extensible later if needed.
> 
> Figured this way would make it easier to run *without* the params (like
> in the existing examples). But don't feel strongly about it, let's see
> what Daniel thinks.

My preference is what Toke has here, this simplifies use by just being able to
call bpf_redirect_neigh(ifindex, NULL, 0, 0) when just single external facing
device is used.

>> A couple of nits below that caught me eye.
> 
> Thanks, will fix; the kernel bot also found a sparse warning, so I guess
> I need to respin anyway (but waiting for Daniel's comments and/or
> instructions on what tree to properly submit this to).

Given API change, lets do bpf. (Will review the rest later today.)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RFC bpf-next 1/2] bpf_redirect_neigh: Support supplying the nexthop as a helper parameter
  2020-10-19 13:09       ` Daniel Borkmann
@ 2020-10-19 13:28         ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 15+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-10-19 13:28 UTC (permalink / raw)
  To: Daniel Borkmann, David Ahern; +Cc: David Ahern, netdev, bpf

Daniel Borkmann <daniel@iogearbox.net> writes:

> On 10/15/20 9:34 PM, Toke Høiland-Jørgensen wrote:
>> David Ahern <dsahern@gmail.com> writes:
>>> On 10/15/20 9:46 AM, Toke Høiland-Jørgensen wrote:
>>>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>>>> index bf5a99d803e4..980cc1363be8 100644
>>>> --- a/include/uapi/linux/bpf.h
>>>> +++ b/include/uapi/linux/bpf.h
>>>> @@ -3677,15 +3677,19 @@ union bpf_attr {
>>>>    * 	Return
>>>>    * 		The id is returned or 0 in case the id could not be retrieved.
>>>>    *
>>>> - * long bpf_redirect_neigh(u32 ifindex, u64 flags)
>>>> + * long bpf_redirect_neigh(u32 ifindex, struct bpf_redir_neigh *params, int plen, u64 flags)
>>>
>>> why not fold ifindex into params? with params and plen this should be
>>> extensible later if needed.
>> 
>> Figured this way would make it easier to run *without* the params (like
>> in the existing examples). But don't feel strongly about it, let's see
>> what Daniel thinks.
>
> My preference is what Toke has here, this simplifies use by just being able to
> call bpf_redirect_neigh(ifindex, NULL, 0, 0) when just single external facing
> device is used.
>
>>> A couple of nits below that caught me eye.
>> 
>> Thanks, will fix; the kernel bot also found a sparse warning, so I guess
>> I need to respin anyway (but waiting for Daniel's comments and/or
>> instructions on what tree to properly submit this to).
>
> Given API change, lets do bpf. (Will review the rest later today.)

Right, ACK. I'll wait for your review, then resubmit against the bpf
tree :)

-Toke


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RFC bpf-next 1/2] bpf_redirect_neigh: Support supplying the nexthop as a helper parameter
  2020-10-15 15:46 ` [PATCH RFC bpf-next 1/2] bpf_redirect_neigh: Support supplying the nexthop as a helper parameter Toke Høiland-Jørgensen
  2020-10-15 16:27   ` David Ahern
@ 2020-10-15 18:17   ` kernel test robot
  2020-10-15 20:59   ` kernel test robot
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: kernel test robot @ 2020-10-15 18:17 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 15230 bytes --]

Hi "Toke,

[FYI, it's a private test report for your RFC patch.]
[auto build test WARNING on bpf-next/master]

url:    https://github.com/0day-ci/linux/commits/Toke-H-iland-J-rgensen/bpf-Rework-bpf_redirect_neigh-to-allow-supplying-nexthop-from-caller/20201015-234710
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: i386-randconfig-s002-20201014 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
reproduce:
        # apt-get install sparse
        # sparse version: v0.6.3-rc1-dirty
        # https://github.com/0day-ci/linux/commit/49323d0d5b278524de4bd9450ce2962a44311fec
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Toke-H-iland-J-rgensen/bpf-Rework-bpf_redirect_neigh-to-allow-supplying-nexthop-from-caller/20201015-234710
        git checkout 49323d0d5b278524de4bd9450ce2962a44311fec
        # save the attached .config to linux build tree
        make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=i386 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>


"sparse warnings: (new ones prefixed by >>)"
   net/core/filter.c:430:33: sparse: sparse: subtraction of functions? Share your drugs
   net/core/filter.c:433:33: sparse: sparse: subtraction of functions? Share your drugs
   net/core/filter.c:436:33: sparse: sparse: subtraction of functions? Share your drugs
   net/core/filter.c:439:33: sparse: sparse: subtraction of functions? Share your drugs
   net/core/filter.c:442:33: sparse: sparse: subtraction of functions? Share your drugs
   net/core/filter.c:516:27: sparse: sparse: subtraction of functions? Share your drugs
   net/core/filter.c:519:27: sparse: sparse: subtraction of functions? Share your drugs
   net/core/filter.c:522:27: sparse: sparse: subtraction of functions? Share your drugs
   net/core/filter.c:1410:39: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct sock_filter const *filter @@     got struct sock_filter [noderef] __user *filter @@
   net/core/filter.c:1410:39: sparse:     expected struct sock_filter const *filter
   net/core/filter.c:1410:39: sparse:     got struct sock_filter [noderef] __user *filter
   net/core/filter.c:1488:39: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct sock_filter const *filter @@     got struct sock_filter [noderef] __user *filter @@
   net/core/filter.c:1488:39: sparse:     expected struct sock_filter const *filter
   net/core/filter.c:1488:39: sparse:     got struct sock_filter [noderef] __user *filter
>> net/core/filter.c:2314:45: sparse: sparse: incorrect type in argument 2 (different base types) @@     expected restricted __be32 [usertype] daddr @@     got unsigned int [usertype] ipv4_nh @@
>> net/core/filter.c:2314:45: sparse:     expected restricted __be32 [usertype] daddr
>> net/core/filter.c:2314:45: sparse:     got unsigned int [usertype] ipv4_nh
   net/core/filter.c:7806:27: sparse: sparse: subtraction of functions? Share your drugs
   net/core/filter.c:7809:27: sparse: sparse: subtraction of functions? Share your drugs
   net/core/filter.c:7812:27: sparse: sparse: subtraction of functions? Share your drugs
   net/core/filter.c:9643:31: sparse: sparse: symbol 'sk_filter_verifier_ops' was not declared. Should it be static?
   net/core/filter.c:9650:27: sparse: sparse: symbol 'sk_filter_prog_ops' was not declared. Should it be static?
   net/core/filter.c:9654:31: sparse: sparse: symbol 'tc_cls_act_verifier_ops' was not declared. Should it be static?
   net/core/filter.c:9662:27: sparse: sparse: symbol 'tc_cls_act_prog_ops' was not declared. Should it be static?
   net/core/filter.c:9666:31: sparse: sparse: symbol 'xdp_verifier_ops' was not declared. Should it be static?
   net/core/filter.c:9677:31: sparse: sparse: symbol 'cg_skb_verifier_ops' was not declared. Should it be static?
   net/core/filter.c:9683:27: sparse: sparse: symbol 'cg_skb_prog_ops' was not declared. Should it be static?
   net/core/filter.c:9687:31: sparse: sparse: symbol 'lwt_in_verifier_ops' was not declared. Should it be static?
   net/core/filter.c:9693:27: sparse: sparse: symbol 'lwt_in_prog_ops' was not declared. Should it be static?
   net/core/filter.c:9697:31: sparse: sparse: symbol 'lwt_out_verifier_ops' was not declared. Should it be static?
   net/core/filter.c:9703:27: sparse: sparse: symbol 'lwt_out_prog_ops' was not declared. Should it be static?
   net/core/filter.c:9707:31: sparse: sparse: symbol 'lwt_xmit_verifier_ops' was not declared. Should it be static?
   net/core/filter.c:9714:27: sparse: sparse: symbol 'lwt_xmit_prog_ops' was not declared. Should it be static?
   net/core/filter.c:9718:31: sparse: sparse: symbol 'lwt_seg6local_verifier_ops' was not declared. Should it be static?
   net/core/filter.c:9724:27: sparse: sparse: symbol 'lwt_seg6local_prog_ops' was not declared. Should it be static?
   net/core/filter.c:9728:31: sparse: sparse: symbol 'cg_sock_verifier_ops' was not declared. Should it be static?
   net/core/filter.c:9734:27: sparse: sparse: symbol 'cg_sock_prog_ops' was not declared. Should it be static?
   net/core/filter.c:9737:31: sparse: sparse: symbol 'cg_sock_addr_verifier_ops' was not declared. Should it be static?
   net/core/filter.c:9743:27: sparse: sparse: symbol 'cg_sock_addr_prog_ops' was not declared. Should it be static?
   net/core/filter.c:9746:31: sparse: sparse: symbol 'sock_ops_verifier_ops' was not declared. Should it be static?
   net/core/filter.c:9752:27: sparse: sparse: symbol 'sock_ops_prog_ops' was not declared. Should it be static?
   net/core/filter.c:9755:31: sparse: sparse: symbol 'sk_skb_verifier_ops' was not declared. Should it be static?
   net/core/filter.c:9762:27: sparse: sparse: symbol 'sk_skb_prog_ops' was not declared. Should it be static?
   net/core/filter.c:9765:31: sparse: sparse: symbol 'sk_msg_verifier_ops' was not declared. Should it be static?
   net/core/filter.c:9772:27: sparse: sparse: symbol 'sk_msg_prog_ops' was not declared. Should it be static?
   net/core/filter.c:9775:31: sparse: sparse: symbol 'flow_dissector_verifier_ops' was not declared. Should it be static?
   net/core/filter.c:9781:27: sparse: sparse: symbol 'flow_dissector_prog_ops' was not declared. Should it be static?
   net/core/filter.c:10087:31: sparse: sparse: symbol 'sk_reuseport_verifier_ops' was not declared. Should it be static?
   net/core/filter.c:10093:27: sparse: sparse: symbol 'sk_reuseport_prog_ops' was not declared. Should it be static?
   net/core/filter.c:10269:27: sparse: sparse: symbol 'sk_lookup_prog_ops' was not declared. Should it be static?
   net/core/filter.c:10272:31: sparse: sparse: symbol 'sk_lookup_verifier_ops' was not declared. Should it be static?
   net/core/filter.c:245:32: sparse: sparse: cast to restricted __be16
   net/core/filter.c:245:32: sparse: sparse: cast to restricted __be16
   net/core/filter.c:245:32: sparse: sparse: cast to restricted __be16
   net/core/filter.c:245:32: sparse: sparse: cast to restricted __be16
   net/core/filter.c:272:32: sparse: sparse: cast to restricted __be32
   net/core/filter.c:272:32: sparse: sparse: cast to restricted __be32
   net/core/filter.c:272:32: sparse: sparse: cast to restricted __be32
   net/core/filter.c:272:32: sparse: sparse: cast to restricted __be32
   net/core/filter.c:272:32: sparse: sparse: cast to restricted __be32
   net/core/filter.c:272:32: sparse: sparse: cast to restricted __be32
   net/core/filter.c:1912:43: sparse: sparse: incorrect type in argument 2 (different base types) @@     expected restricted __wsum [usertype] diff @@     got unsigned long long [usertype] to @@
   net/core/filter.c:1912:43: sparse:     expected restricted __wsum [usertype] diff
   net/core/filter.c:1912:43: sparse:     got unsigned long long [usertype] to
   net/core/filter.c:1915:36: sparse: sparse: incorrect type in argument 2 (different base types) @@     expected restricted __be16 [usertype] old @@     got unsigned long long [usertype] from @@
   net/core/filter.c:1915:36: sparse:     expected restricted __be16 [usertype] old
   net/core/filter.c:1915:36: sparse:     got unsigned long long [usertype] from
   net/core/filter.c:1915:42: sparse: sparse: incorrect type in argument 3 (different base types) @@     expected restricted __be16 [usertype] new @@     got unsigned long long [usertype] to @@
   net/core/filter.c:1915:42: sparse:     expected restricted __be16 [usertype] new
   net/core/filter.c:1915:42: sparse:     got unsigned long long [usertype] to
   net/core/filter.c:1918:36: sparse: sparse: incorrect type in argument 2 (different base types) @@     expected restricted __be32 [usertype] from @@     got unsigned long long [usertype] from @@
   net/core/filter.c:1918:36: sparse:     expected restricted __be32 [usertype] from
   net/core/filter.c:1918:36: sparse:     got unsigned long long [usertype] from
   net/core/filter.c:1918:42: sparse: sparse: incorrect type in argument 3 (different base types) @@     expected restricted __be32 [usertype] to @@     got unsigned long long [usertype] to @@
   net/core/filter.c:1918:42: sparse:     expected restricted __be32 [usertype] to
   net/core/filter.c:1918:42: sparse:     got unsigned long long [usertype] to
   net/core/filter.c:1963:59: sparse: sparse: incorrect type in argument 3 (different base types) @@     expected restricted __wsum [usertype] diff @@     got unsigned long long [usertype] to @@
   net/core/filter.c:1963:59: sparse:     expected restricted __wsum [usertype] diff
   net/core/filter.c:1963:59: sparse:     got unsigned long long [usertype] to
   net/core/filter.c:1966:52: sparse: sparse: incorrect type in argument 3 (different base types) @@     expected restricted __be16 [usertype] from @@     got unsigned long long [usertype] from @@
   net/core/filter.c:1966:52: sparse:     expected restricted __be16 [usertype] from
   net/core/filter.c:1966:52: sparse:     got unsigned long long [usertype] from
   net/core/filter.c:1966:58: sparse: sparse: incorrect type in argument 4 (different base types) @@     expected restricted __be16 [usertype] to @@     got unsigned long long [usertype] to @@
   net/core/filter.c:1966:58: sparse:     expected restricted __be16 [usertype] to
   net/core/filter.c:1966:58: sparse:     got unsigned long long [usertype] to
   net/core/filter.c:1969:52: sparse: sparse: incorrect type in argument 3 (different base types) @@     expected restricted __be32 [usertype] from @@     got unsigned long long [usertype] from @@
   net/core/filter.c:1969:52: sparse:     expected restricted __be32 [usertype] from
   net/core/filter.c:1969:52: sparse:     got unsigned long long [usertype] from
   net/core/filter.c:1969:58: sparse: sparse: incorrect type in argument 4 (different base types) @@     expected restricted __be32 [usertype] to @@     got unsigned long long [usertype] to @@
   net/core/filter.c:1969:58: sparse:     expected restricted __be32 [usertype] to
   net/core/filter.c:1969:58: sparse:     got unsigned long long [usertype] to
   net/core/filter.c:2015:28: sparse: sparse: incorrect type in return expression (different base types) @@     expected unsigned long long @@     got restricted __wsum @@
   net/core/filter.c:2015:28: sparse:     expected unsigned long long
   net/core/filter.c:2015:28: sparse:     got restricted __wsum
   net/core/filter.c:2037:35: sparse: sparse: incorrect type in return expression (different base types) @@     expected unsigned long long @@     got restricted __wsum [usertype] csum @@
   net/core/filter.c:2037:35: sparse:     expected unsigned long long
   net/core/filter.c:2037:35: sparse:     got restricted __wsum [usertype] csum
   net/core/filter.c: note: in included file (through include/net/ip.h):
   include/net/route.h:372:48: sparse: sparse: incorrect type in argument 2 (different base types) @@     expected unsigned int [usertype] key @@     got restricted __be32 [usertype] daddr @@
   include/net/route.h:372:48: sparse:     expected unsigned int [usertype] key
   include/net/route.h:372:48: sparse:     got restricted __be32 [usertype] daddr
   include/net/route.h:372:48: sparse: sparse: incorrect type in argument 2 (different base types) @@     expected unsigned int [usertype] key @@     got restricted __be32 [usertype] daddr @@
   include/net/route.h:372:48: sparse:     expected unsigned int [usertype] key
   include/net/route.h:372:48: sparse:     got restricted __be32 [usertype] daddr
   include/net/route.h:372:48: sparse: sparse: incorrect type in argument 2 (different base types) @@     expected unsigned int [usertype] key @@     got restricted __be32 [usertype] daddr @@
   include/net/route.h:372:48: sparse:     expected unsigned int [usertype] key
   include/net/route.h:372:48: sparse:     got restricted __be32 [usertype] daddr
>> net/core/filter.c:2331:9: sparse: sparse: context imbalance in 'bpf_out_neigh_v4' - different lock contexts for basic block

vim +2314 net/core/filter.c

  2273	
  2274	#if IS_ENABLED(CONFIG_INET)
  2275	static int bpf_out_neigh_v4(struct net *net, struct sk_buff *skb,
  2276				    struct net_device *dev, struct bpf_nh_params *nh)
  2277	{
  2278		u32 hh_len = LL_RESERVED_SPACE(dev);
  2279		struct neighbour *neigh;
  2280		bool is_v6gw = false;
  2281	
  2282		if (dev_xmit_recursion()) {
  2283			net_crit_ratelimited("bpf: recursion limit reached on datapath, buggy bpf program?\n");
  2284			goto out_drop;
  2285		}
  2286	
  2287		skb->dev = dev;
  2288		skb->tstamp = 0;
  2289	
  2290		if (unlikely(skb_headroom(skb) < hh_len && dev->header_ops)) {
  2291			struct sk_buff *skb2;
  2292	
  2293			skb2 = skb_realloc_headroom(skb, hh_len);
  2294			if (unlikely(!skb2)) {
  2295				kfree_skb(skb);
  2296				return -ENOMEM;
  2297			}
  2298			if (skb->sk)
  2299				skb_set_owner_w(skb2, skb->sk);
  2300			consume_skb(skb);
  2301			skb = skb2;
  2302		}
  2303	
  2304		rcu_read_lock_bh();
  2305		if (!nh) {
  2306			struct dst_entry *dst = skb_dst(skb);
  2307			struct rtable *rt = container_of(dst, struct rtable, dst);
  2308	
  2309			neigh = ip_neigh_for_gw(rt, skb, &is_v6gw);
  2310		} else if (nh->nh_family == AF_INET6) {
  2311			neigh = ip_neigh_gw6(dev, &nh->ipv6_nh);
  2312			is_v6gw = true;
  2313		} else if (nh->nh_family == AF_INET) {
> 2314			neigh = ip_neigh_gw4(dev, nh->ipv4_nh);
  2315		} else {
  2316			goto out_drop;
  2317		}
  2318	
  2319		if (likely(!IS_ERR(neigh))) {
  2320			int ret;
  2321	
  2322			sock_confirm_neigh(skb, neigh);
  2323			dev_xmit_recursion_inc();
  2324			ret = neigh_output(neigh, skb, is_v6gw);
  2325			dev_xmit_recursion_dec();
  2326			rcu_read_unlock_bh();
  2327			return ret;
  2328		}
  2329		rcu_read_unlock_bh();
  2330	out_drop:
> 2331		kfree_skb(skb);
  2332		return -ENETDOWN;
  2333	}
  2334	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 35511 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RFC bpf-next 1/2] bpf_redirect_neigh: Support supplying the nexthop as a helper parameter
  2020-10-15 15:46 ` [PATCH RFC bpf-next 1/2] bpf_redirect_neigh: Support supplying the nexthop as a helper parameter Toke Høiland-Jørgensen
  2020-10-15 16:27   ` David Ahern
  2020-10-15 18:17   ` kernel test robot
@ 2020-10-15 20:59   ` kernel test robot
  2020-10-15 21:27   ` kernel test robot
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: kernel test robot @ 2020-10-15 20:59 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 5161 bytes --]

Hi "Toke,

[FYI, it's a private test report for your RFC patch.]
[auto build test ERROR on bpf-next/master]

url:    https://github.com/0day-ci/linux/commits/Toke-H-iland-J-rgensen/bpf-Rework-bpf_redirect_neigh-to-allow-supplying-nexthop-from-caller/20201015-234710
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: h8300-randconfig-r001-20201014 (attached as .config)
compiler: h8300-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/49323d0d5b278524de4bd9450ce2962a44311fec
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Toke-H-iland-J-rgensen/bpf-Rework-bpf_redirect_neigh-to-allow-supplying-nexthop-from-caller/20201015-234710
        git checkout 49323d0d5b278524de4bd9450ce2962a44311fec
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=h8300 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   In file included from include/linux/kernel.h:11,
                    from include/linux/list.h:9,
                    from include/linux/module.h:12,
                    from net/core/filter.c:20:
   include/linux/scatterlist.h: In function 'sg_set_buf':
   include/asm-generic/page.h:93:50: warning: ordered comparison of pointer with null pointer [-Wextra]
      93 | #define virt_addr_valid(kaddr) (((void *)(kaddr) >= (void *)PAGE_OFFSET) && \
         |                                                  ^~
   include/linux/compiler.h:78:42: note: in definition of macro 'unlikely'
      78 | # define unlikely(x) __builtin_expect(!!(x), 0)
         |                                          ^
   include/linux/scatterlist.h:143:2: note: in expansion of macro 'BUG_ON'
     143 |  BUG_ON(!virt_addr_valid(buf));
         |  ^~~~~~
   include/linux/scatterlist.h:143:10: note: in expansion of macro 'virt_addr_valid'
     143 |  BUG_ON(!virt_addr_valid(buf));
         |          ^~~~~~~~~~~~~~~
   net/core/filter.c: In function '__bpf_redirect_neigh':
>> net/core/filter.c:2404:10: error: too many arguments to function '__bpf_redirect_neigh_v6'
    2404 |   return __bpf_redirect_neigh_v6(skb, dev, nh);
         |          ^~~~~~~~~~~~~~~~~~~~~~~
   net/core/filter.c:2267:12: note: declared here
    2267 | static int __bpf_redirect_neigh_v6(struct sk_buff *skb, struct net_device *dev)
         |            ^~~~~~~~~~~~~~~~~~~~~~~
   In file included from arch/h8300/include/asm/atomic.h:7,
                    from include/linux/atomic.h:7,
                    from include/asm-generic/bitops/lock.h:5,
                    from arch/h8300/include/asm/bitops.h:170,
                    from include/linux/bitops.h:29,
                    from include/linux/kernel.h:12,
                    from include/linux/list.h:9,
                    from include/linux/module.h:12,
                    from net/core/filter.c:20:
   net/core/filter.c: In function 'bpf_clear_redirect_map':
   arch/h8300/include/asm/cmpxchg.h:54:3: warning: value computed is not used [-Wunused-value]
      54 |  ((__typeof__(*(ptr)))__cmpxchg_local_generic((ptr),   \
         |  ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      55 |            (unsigned long)(o), \
         |            ~~~~~~~~~~~~~~~~~~~~~
      56 |            (unsigned long)(n), \
         |            ~~~~~~~~~~~~~~~~~~~~~
      57 |            sizeof(*(ptr))))
         |            ~~~~~~~~~~~~~~~~
   include/asm-generic/cmpxchg.h:106:28: note: in expansion of macro 'cmpxchg_local'
     106 | #define cmpxchg(ptr, o, n) cmpxchg_local((ptr), (o), (n))
         |                            ^~~~~~~~~~~~~
   net/core/filter.c:3978:4: note: in expansion of macro 'cmpxchg'
    3978 |    cmpxchg(&ri->map, map, NULL);
         |    ^~~~~~~

vim +/__bpf_redirect_neigh_v6 +2404 net/core/filter.c

  2385	
  2386	static int __bpf_redirect_neigh(struct sk_buff *skb, struct net_device *dev,
  2387					struct bpf_nh_params *nh)
  2388	{
  2389		struct ethhdr *ethh = eth_hdr(skb);
  2390	
  2391		if (unlikely(skb->mac_header >= skb->network_header))
  2392			goto out;
  2393		bpf_push_mac_rcsum(skb);
  2394		if (is_multicast_ether_addr(ethh->h_dest))
  2395			goto out;
  2396	
  2397		skb_pull(skb, sizeof(*ethh));
  2398		skb_unset_mac_header(skb);
  2399		skb_reset_network_header(skb);
  2400	
  2401		if (skb->protocol == htons(ETH_P_IP))
  2402			return __bpf_redirect_neigh_v4(skb, dev, nh);
  2403		else if (skb->protocol == htons(ETH_P_IPV6))
> 2404			return __bpf_redirect_neigh_v6(skb, dev, nh);
  2405	out:
  2406		kfree_skb(skb);
  2407		return -ENOTSUPP;
  2408	}
  2409	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 24806 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RFC bpf-next 1/2] bpf_redirect_neigh: Support supplying the nexthop as a helper parameter
  2020-10-15 15:46 ` [PATCH RFC bpf-next 1/2] bpf_redirect_neigh: Support supplying the nexthop as a helper parameter Toke Høiland-Jørgensen
                     ` (2 preceding siblings ...)
  2020-10-15 20:59   ` kernel test robot
@ 2020-10-15 21:27   ` kernel test robot
  2020-10-19 14:48   ` Daniel Borkmann
  2020-10-19 15:01   ` Daniel Borkmann
  5 siblings, 0 replies; 15+ messages in thread
From: kernel test robot @ 2020-10-15 21:27 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 9050 bytes --]

Hi "Toke,

[FYI, it's a private test report for your RFC patch.]
[auto build test ERROR on bpf-next/master]

url:    https://github.com/0day-ci/linux/commits/Toke-H-iland-J-rgensen/bpf-Rework-bpf_redirect_neigh-to-allow-supplying-nexthop-from-caller/20201015-234710
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: powerpc-randconfig-r006-20201014 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project e7b4feea8e1bf520b34ad8c116abab6677344b74)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install powerpc cross compiling tool for clang build
        # apt-get install binutils-powerpc-linux-gnu
        # https://github.com/0day-ci/linux/commit/49323d0d5b278524de4bd9450ce2962a44311fec
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Toke-H-iland-J-rgensen/bpf-Rework-bpf_redirect_neigh-to-allow-supplying-nexthop-from-caller/20201015-234710
        git checkout 49323d0d5b278524de4bd9450ce2962a44311fec
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   __do_insb
   ^
   arch/powerpc/include/asm/io.h:541:56: note: expanded from macro '__do_insb'
   #define __do_insb(p, b, n)      readsb((PCI_IO_ADDR)_IO_BASE+(p), (b), (n))
                                          ~~~~~~~~~~~~~~~~~~~~~^
   In file included from net/core/filter.c:25:
   In file included from include/linux/sock_diag.h:5:
   In file included from include/linux/netlink.h:7:
   In file included from include/linux/skbuff.h:31:
   In file included from include/linux/dma-mapping.h:11:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/powerpc/include/asm/io.h:604:
   arch/powerpc/include/asm/io-defs.h:45:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
   DEF_PCI_AC_NORET(insw, (unsigned long p, void *b, unsigned long c),
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   arch/powerpc/include/asm/io.h:601:3: note: expanded from macro 'DEF_PCI_AC_NORET'
                   __do_##name al;                                 \
                   ^~~~~~~~~~~~~~
   <scratch space>:141:1: note: expanded from here
   __do_insw
   ^
   arch/powerpc/include/asm/io.h:542:56: note: expanded from macro '__do_insw'
   #define __do_insw(p, b, n)      readsw((PCI_IO_ADDR)_IO_BASE+(p), (b), (n))
                                          ~~~~~~~~~~~~~~~~~~~~~^
   In file included from net/core/filter.c:25:
   In file included from include/linux/sock_diag.h:5:
   In file included from include/linux/netlink.h:7:
   In file included from include/linux/skbuff.h:31:
   In file included from include/linux/dma-mapping.h:11:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/powerpc/include/asm/io.h:604:
   arch/powerpc/include/asm/io-defs.h:47:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
   DEF_PCI_AC_NORET(insl, (unsigned long p, void *b, unsigned long c),
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   arch/powerpc/include/asm/io.h:601:3: note: expanded from macro 'DEF_PCI_AC_NORET'
                   __do_##name al;                                 \
                   ^~~~~~~~~~~~~~
   <scratch space>:146:1: note: expanded from here
   __do_insl
   ^
   arch/powerpc/include/asm/io.h:543:56: note: expanded from macro '__do_insl'
   #define __do_insl(p, b, n)      readsl((PCI_IO_ADDR)_IO_BASE+(p), (b), (n))
                                          ~~~~~~~~~~~~~~~~~~~~~^
   In file included from net/core/filter.c:25:
   In file included from include/linux/sock_diag.h:5:
   In file included from include/linux/netlink.h:7:
   In file included from include/linux/skbuff.h:31:
   In file included from include/linux/dma-mapping.h:11:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/powerpc/include/asm/io.h:604:
   arch/powerpc/include/asm/io-defs.h:49:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
   DEF_PCI_AC_NORET(outsb, (unsigned long p, const void *b, unsigned long c),
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   arch/powerpc/include/asm/io.h:601:3: note: expanded from macro 'DEF_PCI_AC_NORET'
                   __do_##name al;                                 \
                   ^~~~~~~~~~~~~~
   <scratch space>:151:1: note: expanded from here
   __do_outsb
   ^
   arch/powerpc/include/asm/io.h:544:58: note: expanded from macro '__do_outsb'
   #define __do_outsb(p, b, n)     writesb((PCI_IO_ADDR)_IO_BASE+(p),(b),(n))
                                           ~~~~~~~~~~~~~~~~~~~~~^
   In file included from net/core/filter.c:25:
   In file included from include/linux/sock_diag.h:5:
   In file included from include/linux/netlink.h:7:
   In file included from include/linux/skbuff.h:31:
   In file included from include/linux/dma-mapping.h:11:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/powerpc/include/asm/io.h:604:
   arch/powerpc/include/asm/io-defs.h:51:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
   DEF_PCI_AC_NORET(outsw, (unsigned long p, const void *b, unsigned long c),
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   arch/powerpc/include/asm/io.h:601:3: note: expanded from macro 'DEF_PCI_AC_NORET'
                   __do_##name al;                                 \
                   ^~~~~~~~~~~~~~
   <scratch space>:156:1: note: expanded from here
   __do_outsw
   ^
   arch/powerpc/include/asm/io.h:545:58: note: expanded from macro '__do_outsw'
   #define __do_outsw(p, b, n)     writesw((PCI_IO_ADDR)_IO_BASE+(p),(b),(n))
                                           ~~~~~~~~~~~~~~~~~~~~~^
   In file included from net/core/filter.c:25:
   In file included from include/linux/sock_diag.h:5:
   In file included from include/linux/netlink.h:7:
   In file included from include/linux/skbuff.h:31:
   In file included from include/linux/dma-mapping.h:11:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/powerpc/include/asm/io.h:604:
   arch/powerpc/include/asm/io-defs.h:53:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
   DEF_PCI_AC_NORET(outsl, (unsigned long p, const void *b, unsigned long c),
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   arch/powerpc/include/asm/io.h:601:3: note: expanded from macro 'DEF_PCI_AC_NORET'
                   __do_##name al;                                 \
                   ^~~~~~~~~~~~~~
   <scratch space>:161:1: note: expanded from here
   __do_outsl
   ^
   arch/powerpc/include/asm/io.h:546:58: note: expanded from macro '__do_outsl'
   #define __do_outsl(p, b, n)     writesl((PCI_IO_ADDR)_IO_BASE+(p),(b),(n))
                                           ~~~~~~~~~~~~~~~~~~~~~^
>> net/core/filter.c:2404:44: error: too many arguments to function call, expected 2, have 3
                   return __bpf_redirect_neigh_v6(skb, dev, nh);
                          ~~~~~~~~~~~~~~~~~~~~~~~           ^~
   net/core/filter.c:2267:12: note: '__bpf_redirect_neigh_v6' declared here
   static int __bpf_redirect_neigh_v6(struct sk_buff *skb, struct net_device *dev)
              ^
   6 warnings and 1 error generated.

vim +2404 net/core/filter.c

  2385	
  2386	static int __bpf_redirect_neigh(struct sk_buff *skb, struct net_device *dev,
  2387					struct bpf_nh_params *nh)
  2388	{
  2389		struct ethhdr *ethh = eth_hdr(skb);
  2390	
  2391		if (unlikely(skb->mac_header >= skb->network_header))
  2392			goto out;
  2393		bpf_push_mac_rcsum(skb);
  2394		if (is_multicast_ether_addr(ethh->h_dest))
  2395			goto out;
  2396	
  2397		skb_pull(skb, sizeof(*ethh));
  2398		skb_unset_mac_header(skb);
  2399		skb_reset_network_header(skb);
  2400	
  2401		if (skb->protocol == htons(ETH_P_IP))
  2402			return __bpf_redirect_neigh_v4(skb, dev, nh);
  2403		else if (skb->protocol == htons(ETH_P_IPV6))
> 2404			return __bpf_redirect_neigh_v6(skb, dev, nh);
  2405	out:
  2406		kfree_skb(skb);
  2407		return -ENOTSUPP;
  2408	}
  2409	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 29713 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RFC bpf-next 1/2] bpf_redirect_neigh: Support supplying the nexthop as a helper parameter
  2020-10-15 15:46 ` [PATCH RFC bpf-next 1/2] bpf_redirect_neigh: Support supplying the nexthop as a helper parameter Toke Høiland-Jørgensen
                     ` (3 preceding siblings ...)
  2020-10-15 21:27   ` kernel test robot
@ 2020-10-19 14:48   ` Daniel Borkmann
  2020-10-19 14:56     ` Toke Høiland-Jørgensen
  2020-10-19 15:01   ` Daniel Borkmann
  5 siblings, 1 reply; 15+ messages in thread
From: Daniel Borkmann @ 2020-10-19 14:48 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen; +Cc: David Ahern, netdev, bpf

On 10/15/20 5:46 PM, Toke Høiland-Jørgensen wrote:
> From: Toke Høiland-Jørgensen <toke@redhat.com>
> 
> Based on the discussion in [0], update the bpf_redirect_neigh() helper to
> accept an optional parameter specifying the nexthop information. This makes
> it possible to combine bpf_fib_lookup() and bpf_redirect_neigh() without
> incurring a duplicate FIB lookup - since the FIB lookup helper will return
> the nexthop information even if no neighbour is present, this can simply be
> passed on to bpf_redirect_neigh() if bpf_fib_lookup() returns
> BPF_FIB_LKUP_RET_NO_NEIGH.
> 
> [0] https://lore.kernel.org/bpf/393e17fc-d187-3a8d-2f0d-a627c7c63fca@iogearbox.net/
> 
> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>

Overall looks good from what I can tell, just small nits below on top of
David's feedback:

[...]
> -static int __bpf_redirect_neigh_v4(struct sk_buff *skb, struct net_device *dev)
> +static int __bpf_redirect_neigh_v4(struct sk_buff *skb, struct net_device *dev,
> +				   struct bpf_nh_params *nh)
>   {
>   	const struct iphdr *ip4h = ip_hdr(skb);
>   	struct net *net = dev_net(dev);
>   	int err, ret = NET_XMIT_DROP;
> -	struct rtable *rt;
> -	struct flowi4 fl4 = {
> -		.flowi4_flags	= FLOWI_FLAG_ANYSRC,
> -		.flowi4_mark	= skb->mark,
> -		.flowi4_tos	= RT_TOS(ip4h->tos),
> -		.flowi4_oif	= dev->ifindex,
> -		.flowi4_proto	= ip4h->protocol,
> -		.daddr		= ip4h->daddr,
> -		.saddr		= ip4h->saddr,
> -	};
>   
> -	rt = ip_route_output_flow(net, &fl4, NULL);
> -	if (IS_ERR(rt))
> -		goto out_drop;
> -	if (rt->rt_type != RTN_UNICAST && rt->rt_type != RTN_LOCAL) {
> -		ip_rt_put(rt);
> -		goto out_drop;
> -	}
> +	if (!nh->nh_family) {
> +		struct rtable *rt;
> +		struct flowi4 fl4 = {
> +			.flowi4_flags = FLOWI_FLAG_ANYSRC,
> +			.flowi4_mark = skb->mark,
> +			.flowi4_tos = RT_TOS(ip4h->tos),
> +			.flowi4_oif = dev->ifindex,
> +			.flowi4_proto = ip4h->protocol,
> +			.daddr = ip4h->daddr,
> +			.saddr = ip4h->saddr,
> +		};
> +
> +		rt = ip_route_output_flow(net, &fl4, NULL);
> +		if (IS_ERR(rt))
> +			goto out_drop;
> +		if (rt->rt_type != RTN_UNICAST && rt->rt_type != RTN_LOCAL) {
> +			ip_rt_put(rt);
> +			goto out_drop;
> +		}
>   
> -	skb_dst_set(skb, &rt->dst);
> +		skb_dst_set(skb, &rt->dst);
> +		nh = NULL;
> +	}
>   
> -	err = bpf_out_neigh_v4(net, skb);
> +	err = bpf_out_neigh_v4(net, skb, dev, nh);
>   	if (unlikely(net_xmit_eval(err)))
>   		dev->stats.tx_errors++;
>   	else
> @@ -2355,7 +2383,8 @@ static int __bpf_redirect_neigh_v4(struct sk_buff *skb, struct net_device *dev)
>   }
>   #endif /* CONFIG_INET */
>   
> -static int __bpf_redirect_neigh(struct sk_buff *skb, struct net_device *dev)
> +static int __bpf_redirect_neigh(struct sk_buff *skb, struct net_device *dev,
> +				struct bpf_nh_params *nh)
>   {
>   	struct ethhdr *ethh = eth_hdr(skb);
>   
> @@ -2370,9 +2399,9 @@ static int __bpf_redirect_neigh(struct sk_buff *skb, struct net_device *dev)
>   	skb_reset_network_header(skb);
>   
>   	if (skb->protocol == htons(ETH_P_IP))
> -		return __bpf_redirect_neigh_v4(skb, dev);
> +		return __bpf_redirect_neigh_v4(skb, dev, nh);
>   	else if (skb->protocol == htons(ETH_P_IPV6))
> -		return __bpf_redirect_neigh_v6(skb, dev);
> +		return __bpf_redirect_neigh_v6(skb, dev, nh);
>   out:
>   	kfree_skb(skb);
>   	return -ENOTSUPP;
> @@ -2455,8 +2484,8 @@ int skb_do_redirect(struct sk_buff *skb)
>   		return -EAGAIN;
>   	}
>   	return flags & BPF_F_NEIGH ?
> -	       __bpf_redirect_neigh(skb, dev) :
> -	       __bpf_redirect(skb, dev, flags);
> +		__bpf_redirect_neigh(skb, dev, &ri->nh) :
> +		__bpf_redirect(skb, dev, flags);
>   out_drop:
>   	kfree_skb(skb);
>   	return -EINVAL;
> @@ -2504,16 +2533,23 @@ static const struct bpf_func_proto bpf_redirect_peer_proto = {
>   	.arg2_type      = ARG_ANYTHING,
>   };
>   
> -BPF_CALL_2(bpf_redirect_neigh, u32, ifindex, u64, flags)
> +BPF_CALL_4(bpf_redirect_neigh, u32, ifindex, struct bpf_redir_neigh *, params,
> +	   int, plen, u64, flags)
>   {
>   	struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);
>   
> -	if (unlikely(flags))
> +	if (unlikely((plen && plen < sizeof(*params)) || flags))
>   		return TC_ACT_SHOT;
>   
>   	ri->flags = BPF_F_NEIGH;
>   	ri->tgt_index = ifindex;
>   
> +	BUILD_BUG_ON(sizeof(struct bpf_redir_neigh) != sizeof(struct bpf_nh_params));
> +	if (plen)
> +		memcpy(&ri->nh, params, sizeof(ri->nh));
> +	else
> +		ri->nh.nh_family = 0; /* clear previous value */

I'd probably just add an internal flag and do ...

   ri->flags = BPF_F_NEIGH | (plen ? BPF_F_NEXTHOP : 0);

... instead of above clearing, and skb_do_redirect() then becomes:

   __bpf_redirect_neigh(skb, dev, flags & BPF_F_NEXTHOP ? &ri->nh : NULL)

... which would then also avoid this !nh->nh_family check where you later on
set nh = NULL to pass it onwards.

>   	return TC_ACT_REDIRECT;
>   }
>   

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RFC bpf-next 1/2] bpf_redirect_neigh: Support supplying the nexthop as a helper parameter
  2020-10-19 14:48   ` Daniel Borkmann
@ 2020-10-19 14:56     ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 15+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-10-19 14:56 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: David Ahern, netdev, bpf

Daniel Borkmann <daniel@iogearbox.net> writes:

> On 10/15/20 5:46 PM, Toke Høiland-Jørgensen wrote:
>> From: Toke Høiland-Jørgensen <toke@redhat.com>
>> 
>> Based on the discussion in [0], update the bpf_redirect_neigh() helper to
>> accept an optional parameter specifying the nexthop information. This makes
>> it possible to combine bpf_fib_lookup() and bpf_redirect_neigh() without
>> incurring a duplicate FIB lookup - since the FIB lookup helper will return
>> the nexthop information even if no neighbour is present, this can simply be
>> passed on to bpf_redirect_neigh() if bpf_fib_lookup() returns
>> BPF_FIB_LKUP_RET_NO_NEIGH.
>> 
>> [0] https://lore.kernel.org/bpf/393e17fc-d187-3a8d-2f0d-a627c7c63fca@iogearbox.net/
>> 
>> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
>
> Overall looks good from what I can tell, just small nits below on top of
> David's feedback:
>
> [...]
>> -static int __bpf_redirect_neigh_v4(struct sk_buff *skb, struct net_device *dev)
>> +static int __bpf_redirect_neigh_v4(struct sk_buff *skb, struct net_device *dev,
>> +				   struct bpf_nh_params *nh)
>>   {
>>   	const struct iphdr *ip4h = ip_hdr(skb);
>>   	struct net *net = dev_net(dev);
>>   	int err, ret = NET_XMIT_DROP;
>> -	struct rtable *rt;
>> -	struct flowi4 fl4 = {
>> -		.flowi4_flags	= FLOWI_FLAG_ANYSRC,
>> -		.flowi4_mark	= skb->mark,
>> -		.flowi4_tos	= RT_TOS(ip4h->tos),
>> -		.flowi4_oif	= dev->ifindex,
>> -		.flowi4_proto	= ip4h->protocol,
>> -		.daddr		= ip4h->daddr,
>> -		.saddr		= ip4h->saddr,
>> -	};
>>   
>> -	rt = ip_route_output_flow(net, &fl4, NULL);
>> -	if (IS_ERR(rt))
>> -		goto out_drop;
>> -	if (rt->rt_type != RTN_UNICAST && rt->rt_type != RTN_LOCAL) {
>> -		ip_rt_put(rt);
>> -		goto out_drop;
>> -	}
>> +	if (!nh->nh_family) {
>> +		struct rtable *rt;
>> +		struct flowi4 fl4 = {
>> +			.flowi4_flags = FLOWI_FLAG_ANYSRC,
>> +			.flowi4_mark = skb->mark,
>> +			.flowi4_tos = RT_TOS(ip4h->tos),
>> +			.flowi4_oif = dev->ifindex,
>> +			.flowi4_proto = ip4h->protocol,
>> +			.daddr = ip4h->daddr,
>> +			.saddr = ip4h->saddr,
>> +		};
>> +
>> +		rt = ip_route_output_flow(net, &fl4, NULL);
>> +		if (IS_ERR(rt))
>> +			goto out_drop;
>> +		if (rt->rt_type != RTN_UNICAST && rt->rt_type != RTN_LOCAL) {
>> +			ip_rt_put(rt);
>> +			goto out_drop;
>> +		}
>>   
>> -	skb_dst_set(skb, &rt->dst);
>> +		skb_dst_set(skb, &rt->dst);
>> +		nh = NULL;
>> +	}
>>   
>> -	err = bpf_out_neigh_v4(net, skb);
>> +	err = bpf_out_neigh_v4(net, skb, dev, nh);
>>   	if (unlikely(net_xmit_eval(err)))
>>   		dev->stats.tx_errors++;
>>   	else
>> @@ -2355,7 +2383,8 @@ static int __bpf_redirect_neigh_v4(struct sk_buff *skb, struct net_device *dev)
>>   }
>>   #endif /* CONFIG_INET */
>>   
>> -static int __bpf_redirect_neigh(struct sk_buff *skb, struct net_device *dev)
>> +static int __bpf_redirect_neigh(struct sk_buff *skb, struct net_device *dev,
>> +				struct bpf_nh_params *nh)
>>   {
>>   	struct ethhdr *ethh = eth_hdr(skb);
>>   
>> @@ -2370,9 +2399,9 @@ static int __bpf_redirect_neigh(struct sk_buff *skb, struct net_device *dev)
>>   	skb_reset_network_header(skb);
>>   
>>   	if (skb->protocol == htons(ETH_P_IP))
>> -		return __bpf_redirect_neigh_v4(skb, dev);
>> +		return __bpf_redirect_neigh_v4(skb, dev, nh);
>>   	else if (skb->protocol == htons(ETH_P_IPV6))
>> -		return __bpf_redirect_neigh_v6(skb, dev);
>> +		return __bpf_redirect_neigh_v6(skb, dev, nh);
>>   out:
>>   	kfree_skb(skb);
>>   	return -ENOTSUPP;
>> @@ -2455,8 +2484,8 @@ int skb_do_redirect(struct sk_buff *skb)
>>   		return -EAGAIN;
>>   	}
>>   	return flags & BPF_F_NEIGH ?
>> -	       __bpf_redirect_neigh(skb, dev) :
>> -	       __bpf_redirect(skb, dev, flags);
>> +		__bpf_redirect_neigh(skb, dev, &ri->nh) :
>> +		__bpf_redirect(skb, dev, flags);
>>   out_drop:
>>   	kfree_skb(skb);
>>   	return -EINVAL;
>> @@ -2504,16 +2533,23 @@ static const struct bpf_func_proto bpf_redirect_peer_proto = {
>>   	.arg2_type      = ARG_ANYTHING,
>>   };
>>   
>> -BPF_CALL_2(bpf_redirect_neigh, u32, ifindex, u64, flags)
>> +BPF_CALL_4(bpf_redirect_neigh, u32, ifindex, struct bpf_redir_neigh *, params,
>> +	   int, plen, u64, flags)
>>   {
>>   	struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);
>>   
>> -	if (unlikely(flags))
>> +	if (unlikely((plen && plen < sizeof(*params)) || flags))
>>   		return TC_ACT_SHOT;
>>   
>>   	ri->flags = BPF_F_NEIGH;
>>   	ri->tgt_index = ifindex;
>>   
>> +	BUILD_BUG_ON(sizeof(struct bpf_redir_neigh) != sizeof(struct bpf_nh_params));
>> +	if (plen)
>> +		memcpy(&ri->nh, params, sizeof(ri->nh));
>> +	else
>> +		ri->nh.nh_family = 0; /* clear previous value */
>
> I'd probably just add an internal flag and do ...
>
>    ri->flags = BPF_F_NEIGH | (plen ? BPF_F_NEXTHOP : 0);
>
> ... instead of above clearing, and skb_do_redirect() then becomes:
>
>    __bpf_redirect_neigh(skb, dev, flags & BPF_F_NEXTHOP ? &ri->nh : NULL)
>
> ... which would then also avoid this !nh->nh_family check where you later on
> set nh = NULL to pass it onwards.

Ah yes, excellent idea! Will fix :)

-Toke


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RFC bpf-next 1/2] bpf_redirect_neigh: Support supplying the nexthop as a helper parameter
  2020-10-15 15:46 ` [PATCH RFC bpf-next 1/2] bpf_redirect_neigh: Support supplying the nexthop as a helper parameter Toke Høiland-Jørgensen
                     ` (4 preceding siblings ...)
  2020-10-19 14:48   ` Daniel Borkmann
@ 2020-10-19 15:01   ` Daniel Borkmann
  5 siblings, 0 replies; 15+ messages in thread
From: Daniel Borkmann @ 2020-10-19 15:01 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen; +Cc: David Ahern, netdev, bpf

On 10/15/20 5:46 PM, Toke Høiland-Jørgensen wrote:
[...]
> +struct bpf_redir_neigh {
> +	/* network family for lookup (AF_INET, AF_INET6)
> +	 */
> +	__u8	nh_family;
> +	/* network address of nexthop; skips fib lookup to find gateway */
> +	union {
> +		__be32		ipv4_nh;
> +		__u32		ipv6_nh[4];  /* in6_addr; network order */
> +	};
> +};
> +
>   enum bpf_task_fd_type {
>   	BPF_FD_TYPE_RAW_TRACEPOINT,	/* tp name */
>   	BPF_FD_TYPE_TRACEPOINT,		/* tp name */
> diff --git a/net/core/filter.c b/net/core/filter.c
> index c5e2a1c5fd8d..d073031a3a61 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -2165,12 +2165,11 @@ static int __bpf_redirect(struct sk_buff *skb, struct net_device *dev,
>   }
>   
>   #if IS_ENABLED(CONFIG_IPV6)
> -static int bpf_out_neigh_v6(struct net *net, struct sk_buff *skb)
> +static int bpf_out_neigh_v6(struct net *net, struct sk_buff *skb,
> +			    struct net_device *dev, const struct in6_addr *nexthop)
>   {
> -	struct dst_entry *dst = skb_dst(skb);
> -	struct net_device *dev = dst->dev;
>   	u32 hh_len = LL_RESERVED_SPACE(dev);
> -	const struct in6_addr *nexthop;
> +	struct dst_entry *dst = NULL;
>   	struct neighbour *neigh;
>   
>   	if (dev_xmit_recursion()) {
> @@ -2196,8 +2195,11 @@ static int bpf_out_neigh_v6(struct net *net, struct sk_buff *skb)
>   	}
>   
>   	rcu_read_lock_bh();
> -	nexthop = rt6_nexthop(container_of(dst, struct rt6_info, dst),
> -			      &ipv6_hdr(skb)->daddr);
> +	if (!nexthop) {
> +		dst = skb_dst(skb);
> +		nexthop = rt6_nexthop(container_of(dst, struct rt6_info, dst),
> +				      &ipv6_hdr(skb)->daddr);
> +	}
>   	neigh = ip_neigh_gw6(dev, nexthop);
>   	if (likely(!IS_ERR(neigh))) {
>   		int ret;
> @@ -2210,36 +2212,46 @@ static int bpf_out_neigh_v6(struct net *net, struct sk_buff *skb)
>   		return ret;
>   	}
>   	rcu_read_unlock_bh();
> -	IP6_INC_STATS(dev_net(dst->dev),
> -		      ip6_dst_idev(dst), IPSTATS_MIB_OUTNOROUTES);
> +	if (dst)
> +		IP6_INC_STATS(dev_net(dst->dev),
> +			      ip6_dst_idev(dst), IPSTATS_MIB_OUTNOROUTES);
>   out_drop:
>   	kfree_skb(skb);
>   	return -ENETDOWN;
>   }
>   
> -static int __bpf_redirect_neigh_v6(struct sk_buff *skb, struct net_device *dev)
> +static int __bpf_redirect_neigh_v6(struct sk_buff *skb, struct net_device *dev,
> +				   struct bpf_nh_params *nh)
>   {
>   	const struct ipv6hdr *ip6h = ipv6_hdr(skb);
> +	struct in6_addr *nexthop = NULL;
>   	struct net *net = dev_net(dev);
>   	int err, ret = NET_XMIT_DROP;
> -	struct dst_entry *dst;
> -	struct flowi6 fl6 = {
> -		.flowi6_flags	= FLOWI_FLAG_ANYSRC,
> -		.flowi6_mark	= skb->mark,
> -		.flowlabel	= ip6_flowinfo(ip6h),
> -		.flowi6_oif	= dev->ifindex,
> -		.flowi6_proto	= ip6h->nexthdr,
> -		.daddr		= ip6h->daddr,
> -		.saddr		= ip6h->saddr,
> -	};
>   
> -	dst = ipv6_stub->ipv6_dst_lookup_flow(net, NULL, &fl6, NULL);
> -	if (IS_ERR(dst))
> -		goto out_drop;
> +	if (!nh->nh_family) {
> +		struct dst_entry *dst;
> +		struct flowi6 fl6 = {
> +			.flowi6_flags = FLOWI_FLAG_ANYSRC,
> +			.flowi6_mark = skb->mark,
> +			.flowlabel = ip6_flowinfo(ip6h),
> +			.flowi6_oif = dev->ifindex,
> +			.flowi6_proto = ip6h->nexthdr,
> +			.daddr = ip6h->daddr,
> +			.saddr = ip6h->saddr,

nit: Would be good for readability to keep the previous whitespace alignment intact.

> +		};
> +
> +		dst = ipv6_stub->ipv6_dst_lookup_flow(net, NULL, &fl6, NULL);
> +		if (IS_ERR(dst))
> +			goto out_drop;
>   
> -	skb_dst_set(skb, dst);
> +		skb_dst_set(skb, dst);
> +	} else if (nh->nh_family == AF_INET6) {
> +		nexthop = &nh->ipv6_nh;
> +	} else {
> +		goto out_drop;
> +	}
>   
> -	err = bpf_out_neigh_v6(net, skb);
> +	err = bpf_out_neigh_v6(net, skb, dev, nexthop);

I'd probably model the bpf_out_neigh_v{4,6}() as close as possible similar to each other in terms
of args we pass etc. In the v6 case you pass the nexthop in6_addr directly whereas v4 passes
bpf_nh_params, I'd probably also stick to the latter for v6 to keep it symmetric.

>   	if (unlikely(net_xmit_eval(err)))
>   		dev->stats.tx_errors++;
>   	else
> @@ -2260,11 +2272,9 @@ static int __bpf_redirect_neigh_v6(struct sk_buff *skb, struct net_device *dev)
>   #endif /* CONFIG_IPV6 */
>   
>   #if IS_ENABLED(CONFIG_INET)
> -static int bpf_out_neigh_v4(struct net *net, struct sk_buff *skb)
> +static int bpf_out_neigh_v4(struct net *net, struct sk_buff *skb,
> +			    struct net_device *dev, struct bpf_nh_params *nh)
>   {
> -	struct dst_entry *dst = skb_dst(skb);
> -	struct rtable *rt = container_of(dst, struct rtable, dst);
> -	struct net_device *dev = dst->dev;
>   	u32 hh_len = LL_RESERVED_SPACE(dev);
>   	struct neighbour *neigh;
>   	bool is_v6gw = false;
> @@ -2292,7 +2302,20 @@ static int bpf_out_neigh_v4(struct net *net, struct sk_buff *skb)
>   	}
>   
>   	rcu_read_lock_bh();
> -	neigh = ip_neigh_for_gw(rt, skb, &is_v6gw);
> +	if (!nh) {
> +		struct dst_entry *dst = skb_dst(skb);
> +		struct rtable *rt = container_of(dst, struct rtable, dst);
> +
> +		neigh = ip_neigh_for_gw(rt, skb, &is_v6gw);
> +	} else if (nh->nh_family == AF_INET6) {
> +		neigh = ip_neigh_gw6(dev, &nh->ipv6_nh);
> +		is_v6gw = true;
> +	} else if (nh->nh_family == AF_INET) {
> +		neigh = ip_neigh_gw4(dev, nh->ipv4_nh);
> +	} else {
> +		goto out_drop;
> +	}
> +
>   	if (likely(!IS_ERR(neigh))) {
>   		int ret;
>   

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH RFC bpf-next 2/2] selftests: Update test_tc_neigh to use the modified bpf_redirect_neigh()
  2020-10-15 15:46 [PATCH RFC bpf-next 0/2] bpf: Rework bpf_redirect_neigh() to allow supplying nexthop from caller Toke Høiland-Jørgensen
  2020-10-15 15:46 ` [PATCH RFC bpf-next 1/2] bpf_redirect_neigh: Support supplying the nexthop as a helper parameter Toke Høiland-Jørgensen
@ 2020-10-15 15:46 ` Toke Høiland-Jørgensen
  2020-10-19 14:40   ` Daniel Borkmann
  1 sibling, 1 reply; 15+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-10-15 15:46 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: David Ahern, netdev, bpf

From: Toke Høiland-Jørgensen <toke@redhat.com>

This updates the test_tc_neigh selftest to use the new syntax of
bpf_redirect_neigh(). To exercise the helper both with and without the
optional parameter, one forwarding direction is changed to do a
bpf_fib_lookup() followed by a call to bpf_redirect_neigh(), while the
other direction is using the map-based ifindex lookup letting the redirect
helper resolve the nexthop from the FIB.

This also fixes the test_tc_redirect.sh script to work on systems that have
a consolidated dual-stack 'ping' binary instead of separate ping/ping6
versions.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
 tools/testing/selftests/bpf/progs/test_tc_neigh.c |   83 ++++++++++++++++++---
 tools/testing/selftests/bpf/test_tc_redirect.sh   |    8 +-
 2 files changed, 78 insertions(+), 13 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/test_tc_neigh.c b/tools/testing/selftests/bpf/progs/test_tc_neigh.c
index fe182616b112..ba03e603ba9b 100644
--- a/tools/testing/selftests/bpf/progs/test_tc_neigh.c
+++ b/tools/testing/selftests/bpf/progs/test_tc_neigh.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 #include <stdint.h>
 #include <stdbool.h>
+#include <stddef.h>
 
 #include <linux/bpf.h>
 #include <linux/stddef.h>
@@ -32,6 +33,9 @@
 				 a.s6_addr32[3] == b.s6_addr32[3])
 #endif
 
+#define AF_INET 2
+#define AF_INET6 10
+
 enum {
 	dev_src,
 	dev_dst,
@@ -45,7 +49,8 @@ struct bpf_map_def SEC("maps") ifindex_map = {
 };
 
 static __always_inline bool is_remote_ep_v4(struct __sk_buff *skb,
-					    __be32 addr)
+					    __be32 addr,
+					    struct bpf_fib_lookup *fib_params)
 {
 	void *data_end = ctx_ptr(skb->data_end);
 	void *data = ctx_ptr(skb->data);
@@ -58,11 +63,26 @@ static __always_inline bool is_remote_ep_v4(struct __sk_buff *skb,
 	if ((void *)(ip4h + 1) > data_end)
 		return false;
 
-	return ip4h->daddr == addr;
+	if (ip4h->daddr != addr)
+		return false;
+
+	if (fib_params) {
+		fib_params->family = AF_INET;
+		fib_params->tos = ip4h->tos;
+		fib_params->l4_protocol = ip4h->protocol;
+		fib_params->sport = 0;
+		fib_params->dport = 0;
+		fib_params->tot_len = bpf_ntohs(ip4h->tot_len);
+		fib_params->ipv4_src = ip4h->saddr;
+		fib_params->ipv4_dst = ip4h->daddr;
+	}
+
+	return true;
 }
 
 static __always_inline bool is_remote_ep_v6(struct __sk_buff *skb,
-					    struct in6_addr addr)
+					    struct in6_addr addr,
+					    struct bpf_fib_lookup *fib_params)
 {
 	void *data_end = ctx_ptr(skb->data_end);
 	void *data = ctx_ptr(skb->data);
@@ -75,7 +95,24 @@ static __always_inline bool is_remote_ep_v6(struct __sk_buff *skb,
 	if ((void *)(ip6h + 1) > data_end)
 		return false;
 
-	return v6_equal(ip6h->daddr, addr);
+	if (!v6_equal(ip6h->daddr, addr))
+		return false;
+
+	if (fib_params) {
+		struct in6_addr *src = (struct in6_addr *)fib_params->ipv6_src;
+		struct in6_addr *dst = (struct in6_addr *)fib_params->ipv6_dst;
+
+		fib_params->family = AF_INET6;
+		fib_params->flowinfo = 0;
+		fib_params->l4_protocol = ip6h->nexthdr;
+		fib_params->sport = 0;
+		fib_params->dport = 0;
+		fib_params->tot_len = bpf_ntohs(ip6h->payload_len);
+		*src = ip6h->saddr;
+		*dst = ip6h->daddr;
+	}
+
+	return true;
 }
 
 static __always_inline int get_dev_ifindex(int which)
@@ -99,15 +136,17 @@ SEC("chk_egress") int tc_chk(struct __sk_buff *skb)
 
 SEC("dst_ingress") int tc_dst(struct __sk_buff *skb)
 {
+	struct bpf_fib_lookup fib_params = { .ifindex = skb->ingress_ifindex };
 	__u8 zero[ETH_ALEN * 2];
 	bool redirect = false;
+	int ret;
 
 	switch (skb->protocol) {
 	case __bpf_constant_htons(ETH_P_IP):
-		redirect = is_remote_ep_v4(skb, __bpf_constant_htonl(ip4_src));
+		redirect = is_remote_ep_v4(skb, __bpf_constant_htonl(ip4_src), &fib_params);
 		break;
 	case __bpf_constant_htons(ETH_P_IPV6):
-		redirect = is_remote_ep_v6(skb, (struct in6_addr)ip6_src);
+		redirect = is_remote_ep_v6(skb, (struct in6_addr)ip6_src, &fib_params);
 		break;
 	}
 
@@ -118,7 +157,31 @@ SEC("dst_ingress") int tc_dst(struct __sk_buff *skb)
 	if (bpf_skb_store_bytes(skb, 0, &zero, sizeof(zero), 0) < 0)
 		return TC_ACT_SHOT;
 
-	return bpf_redirect_neigh(get_dev_ifindex(dev_src), 0);
+	ret = bpf_fib_lookup(skb, &fib_params, sizeof(fib_params), 0);
+	bpf_printk("bpf_fib_lookup() ret: %d\n", ret);
+	if (ret == BPF_FIB_LKUP_RET_SUCCESS) {
+		void *data_end = ctx_ptr(skb->data_end);
+		struct ethhdr *eth = ctx_ptr(skb->data);
+
+		if (eth + 1 > data_end)
+			return TC_ACT_SHOT;
+
+		__builtin_memcpy(eth->h_dest, fib_params.dmac, ETH_ALEN);
+		__builtin_memcpy(eth->h_source, fib_params.smac, ETH_ALEN);
+
+		return bpf_redirect(fib_params.ifindex, 0);
+
+	} else if (ret == BPF_FIB_LKUP_RET_NO_NEIGH) {
+		struct bpf_redir_neigh nh_params = {};
+
+		nh_params.nh_family = fib_params.family;
+		__builtin_memcpy(&nh_params.ipv6_nh, &fib_params.ipv6_dst,
+				 sizeof(nh_params.ipv6_nh));
+
+		return bpf_redirect_neigh(fib_params.ifindex, &nh_params,
+					  sizeof(nh_params), 0);
+	}
+	return TC_ACT_SHOT;
 }
 
 SEC("src_ingress") int tc_src(struct __sk_buff *skb)
@@ -128,10 +191,10 @@ SEC("src_ingress") int tc_src(struct __sk_buff *skb)
 
 	switch (skb->protocol) {
 	case __bpf_constant_htons(ETH_P_IP):
-		redirect = is_remote_ep_v4(skb, __bpf_constant_htonl(ip4_dst));
+		redirect = is_remote_ep_v4(skb, __bpf_constant_htonl(ip4_dst), NULL);
 		break;
 	case __bpf_constant_htons(ETH_P_IPV6):
-		redirect = is_remote_ep_v6(skb, (struct in6_addr)ip6_dst);
+		redirect = is_remote_ep_v6(skb, (struct in6_addr)ip6_dst, NULL);
 		break;
 	}
 
@@ -142,7 +205,7 @@ SEC("src_ingress") int tc_src(struct __sk_buff *skb)
 	if (bpf_skb_store_bytes(skb, 0, &zero, sizeof(zero), 0) < 0)
 		return TC_ACT_SHOT;
 
-	return bpf_redirect_neigh(get_dev_ifindex(dev_dst), 0);
+	return bpf_redirect_neigh(get_dev_ifindex(dev_dst), NULL, 0, 0);
 }
 
 char __license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/test_tc_redirect.sh b/tools/testing/selftests/bpf/test_tc_redirect.sh
index 6d7482562140..09b20f24d018 100755
--- a/tools/testing/selftests/bpf/test_tc_redirect.sh
+++ b/tools/testing/selftests/bpf/test_tc_redirect.sh
@@ -24,8 +24,7 @@ command -v timeout >/dev/null 2>&1 || \
 	{ echo >&2 "timeout is not available"; exit 1; }
 command -v ping >/dev/null 2>&1 || \
 	{ echo >&2 "ping is not available"; exit 1; }
-command -v ping6 >/dev/null 2>&1 || \
-	{ echo >&2 "ping6 is not available"; exit 1; }
+if command -v ping6 >/dev/null 2>&1; then PING6=ping6; else PING6=ping; fi
 command -v perl >/dev/null 2>&1 || \
 	{ echo >&2 "perl is not available"; exit 1; }
 command -v jq >/dev/null 2>&1 || \
@@ -152,7 +151,7 @@ netns_test_connectivity()
 	echo -e "${TEST}: ${GREEN}PASS${NC}"
 
 	TEST="ICMPv6 connectivity test"
-	ip netns exec ${NS_SRC} ping6 $PING_ARG ${IP6_DST}
+	ip netns exec ${NS_SRC} $PING6 $PING_ARG ${IP6_DST}
 	if [ $? -ne 0 ]; then
 		echo -e "${TEST}: ${RED}FAIL${NC}"
 		exit 1
@@ -179,6 +178,9 @@ netns_setup_bpf()
 	ip netns exec ${NS_FWD} tc filter add dev veth_dst_fwd ingress bpf da obj $obj sec dst_ingress
 	ip netns exec ${NS_FWD} tc filter add dev veth_dst_fwd egress  bpf da obj $obj sec chk_egress
 
+	# bpf_fib_lookup() checks if forwarding is enabled
+	ip netns exec ${NS_FWD} sysctl -w net.ipv4.ip_forward=1 net.ipv6.conf.veth_dst_fwd.forwarding=1
+
 	veth_src=$(ip netns exec ${NS_FWD} cat /sys/class/net/veth_src_fwd/ifindex)
 	veth_dst=$(ip netns exec ${NS_FWD} cat /sys/class/net/veth_dst_fwd/ifindex)
 


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH RFC bpf-next 2/2] selftests: Update test_tc_neigh to use the modified bpf_redirect_neigh()
  2020-10-15 15:46 ` [PATCH RFC bpf-next 2/2] selftests: Update test_tc_neigh to use the modified bpf_redirect_neigh() Toke Høiland-Jørgensen
@ 2020-10-19 14:40   ` Daniel Borkmann
  2020-10-19 14:48     ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 15+ messages in thread
From: Daniel Borkmann @ 2020-10-19 14:40 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen; +Cc: David Ahern, netdev, bpf

On 10/15/20 5:46 PM, Toke Høiland-Jørgensen wrote:
> From: Toke Høiland-Jørgensen <toke@redhat.com>
> 
> This updates the test_tc_neigh selftest to use the new syntax of
> bpf_redirect_neigh(). To exercise the helper both with and without the
> optional parameter, one forwarding direction is changed to do a
> bpf_fib_lookup() followed by a call to bpf_redirect_neigh(), while the
> other direction is using the map-based ifindex lookup letting the redirect
> helper resolve the nexthop from the FIB.
> 
> This also fixes the test_tc_redirect.sh script to work on systems that have
> a consolidated dual-stack 'ping' binary instead of separate ping/ping6
> versions.
> 
> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>

I would prefer if you could not mix the two tests, meaning, one complete test
case is only with bpf_redirect_neigh(get_dev_ifindex(xxx), NULL, 0, 0) for both
directions, and another self-contained one is with fib lookup + bpf_redirect_neigh
with params, even if it means we duplicate test_tc_neigh.c slighly, but I think
that's fine for sake of test coverage.

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RFC bpf-next 2/2] selftests: Update test_tc_neigh to use the modified bpf_redirect_neigh()
  2020-10-19 14:40   ` Daniel Borkmann
@ 2020-10-19 14:48     ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 15+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-10-19 14:48 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: David Ahern, netdev, bpf

Daniel Borkmann <daniel@iogearbox.net> writes:

> On 10/15/20 5:46 PM, Toke Høiland-Jørgensen wrote:
>> From: Toke Høiland-Jørgensen <toke@redhat.com>
>> 
>> This updates the test_tc_neigh selftest to use the new syntax of
>> bpf_redirect_neigh(). To exercise the helper both with and without the
>> optional parameter, one forwarding direction is changed to do a
>> bpf_fib_lookup() followed by a call to bpf_redirect_neigh(), while the
>> other direction is using the map-based ifindex lookup letting the redirect
>> helper resolve the nexthop from the FIB.
>> 
>> This also fixes the test_tc_redirect.sh script to work on systems that have
>> a consolidated dual-stack 'ping' binary instead of separate ping/ping6
>> versions.
>> 
>> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
>
> I would prefer if you could not mix the two tests, meaning, one complete test
> case is only with bpf_redirect_neigh(get_dev_ifindex(xxx), NULL, 0, 0) for both
> directions, and another self-contained one is with fib lookup + bpf_redirect_neigh
> with params, even if it means we duplicate test_tc_neigh.c slighly, but I think
> that's fine for sake of test coverage.

Sure, can do :)

-Toke


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2020-10-19 15:01 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-10-15 15:46 [PATCH RFC bpf-next 0/2] bpf: Rework bpf_redirect_neigh() to allow supplying nexthop from caller Toke Høiland-Jørgensen
2020-10-15 15:46 ` [PATCH RFC bpf-next 1/2] bpf_redirect_neigh: Support supplying the nexthop as a helper parameter Toke Høiland-Jørgensen
2020-10-15 16:27   ` David Ahern
2020-10-15 19:34     ` Toke Høiland-Jørgensen
2020-10-19 13:09       ` Daniel Borkmann
2020-10-19 13:28         ` Toke Høiland-Jørgensen
2020-10-15 18:17   ` kernel test robot
2020-10-15 20:59   ` kernel test robot
2020-10-15 21:27   ` kernel test robot
2020-10-19 14:48   ` Daniel Borkmann
2020-10-19 14:56     ` Toke Høiland-Jørgensen
2020-10-19 15:01   ` Daniel Borkmann
2020-10-15 15:46 ` [PATCH RFC bpf-next 2/2] selftests: Update test_tc_neigh to use the modified bpf_redirect_neigh() Toke Høiland-Jørgensen
2020-10-19 14:40   ` Daniel Borkmann
2020-10-19 14:48     ` Toke Høiland-Jørgensen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.