netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 bpf 0/2] bpf: return proper error codes for lwt redirect
@ 2023-07-25  4:12 Yan Zhai
  2023-07-25  4:13 ` [PATCH v3 bpf 1/2] bpf: fix skb_do_redirect return values Yan Zhai
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Yan Zhai @ 2023-07-25  4:12 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Mykola Lysenko,
	Shuah Khan, Yan Zhai, linux-kernel, netdev, linux-kselftest,
	kernel-team, Jordan Griege

lwt xmit hook does not expect positive return values in function
ip_finish_output2 and ip6_finish_output2. However, BPF redirect programs
can return positive values such like NET_XMIT_DROP, NET_RX_DROP, and etc
as errors. Such return values can panic the kernel unexpectedly:

https://gist.github.com/zhaiyan920/8fbac245b261fe316a7ef04c9b1eba48

This patch fixes the return values from BPF redirect, so the error
handling would be consistent at xmit hook. It also adds a few test cases
to prevent future regressions.

v2: https://lore.kernel.org/netdev/ZLdY6JkWRccunvu0@debian.debian/ 
v1: https://lore.kernel.org/bpf/ZLbYdpWC8zt9EJtq@debian.debian/

changes since v2:
  * subject name changed
  * also covered redirect to ingress case
  * added selftests

changes since v1:
  * minor code style changes

Yan Zhai (2):
  bpf: fix skb_do_redirect return values
  bpf: selftests: add lwt redirect regression test cases

 net/core/filter.c                             |  12 +-
 tools/testing/selftests/bpf/Makefile          |   1 +
 .../selftests/bpf/progs/test_lwt_redirect.c   |  67 +++++++
 .../selftests/bpf/test_lwt_redirect.sh        | 165 ++++++++++++++++++
 4 files changed, 244 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/bpf/progs/test_lwt_redirect.c
 create mode 100755 tools/testing/selftests/bpf/test_lwt_redirect.sh

-- 
2.30.2


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v3 bpf 1/2] bpf: fix skb_do_redirect return values
  2023-07-25  4:12 [PATCH v3 bpf 0/2] bpf: return proper error codes for lwt redirect Yan Zhai
@ 2023-07-25  4:13 ` Yan Zhai
  2023-07-25  5:10   ` Markus Elfring
  2023-07-25  9:08   ` Jakub Sitnicki
  2023-07-25  4:14 ` [PATCH v3 bpf 2/2] selftests/bpf: test lwt redirect error handling Yan Zhai
  2023-07-25 13:20 ` [PATCH v3 bpf 0/2] bpf: return proper error codes for lwt redirect Jakub Sitnicki
  2 siblings, 2 replies; 10+ messages in thread
From: Yan Zhai @ 2023-07-25  4:13 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Mykola Lysenko,
	Shuah Khan, Yan Zhai, linux-kernel, netdev, linux-kselftest,
	kernel-team, Jordan Griege

skb_do_redirect returns various of values: error code (negative), 0
(success), and some positive status code, e.g. NET_XMIT_CN, NET_RX_DROP.
Such code are not handled at lwt xmit hook in function ip_finish_output2
and ip6_finish_output, which can cause unexpected problems. This change
converts the positive status code to proper error code.

Suggested-by: Stanislav Fomichev <sdf@google.com>
Reported-by: Jordan Griege <jgriege@cloudflare.com>
Signed-off-by: Yan Zhai <yan@cloudflare.com>

---
v3: converts also RX side return value in addition to TX values
v2: code style change suggested by Stanislav Fomichev
---
 net/core/filter.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/net/core/filter.c b/net/core/filter.c
index 06ba0e56e369..3e232ce11ca0 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2095,7 +2095,12 @@ static const struct bpf_func_proto bpf_csum_level_proto = {
 
 static inline int __bpf_rx_skb(struct net_device *dev, struct sk_buff *skb)
 {
-	return dev_forward_skb_nomtu(dev, skb);
+	int ret = dev_forward_skb_nomtu(dev, skb);
+
+	if (unlikely(ret > 0))
+		return -ENETDOWN;
+
+	return 0;
 }
 
 static inline int __bpf_rx_skb_no_mac(struct net_device *dev,
@@ -2106,6 +2111,8 @@ static inline int __bpf_rx_skb_no_mac(struct net_device *dev,
 	if (likely(!ret)) {
 		skb->dev = dev;
 		ret = netif_rx(skb);
+	} else if (ret > 0) {
+		return -ENETDOWN;
 	}
 
 	return ret;
@@ -2129,6 +2136,9 @@ static inline int __bpf_tx_skb(struct net_device *dev, struct sk_buff *skb)
 	ret = dev_queue_xmit(skb);
 	dev_xmit_recursion_dec();
 
+	if (unlikely(ret > 0))
+		ret = net_xmit_errno(ret);
+
 	return ret;
 }
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v3 bpf 2/2] selftests/bpf: test lwt redirect error handling
  2023-07-25  4:12 [PATCH v3 bpf 0/2] bpf: return proper error codes for lwt redirect Yan Zhai
  2023-07-25  4:13 ` [PATCH v3 bpf 1/2] bpf: fix skb_do_redirect return values Yan Zhai
@ 2023-07-25  4:14 ` Yan Zhai
  2023-07-25 13:18   ` Jakub Sitnicki
  2023-07-25 13:20 ` [PATCH v3 bpf 0/2] bpf: return proper error codes for lwt redirect Jakub Sitnicki
  2 siblings, 1 reply; 10+ messages in thread
From: Yan Zhai @ 2023-07-25  4:14 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Mykola Lysenko,
	Shuah Khan, Yan Zhai, linux-kernel, netdev, linux-kselftest,
	kernel-team, Jordan Griege

Tests BPF redirect at the lwt xmit hook to ensure error handling are
safe, i.e. won't panic the kernel.

Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Yan Zhai <yan@cloudflare.com>
---
 tools/testing/selftests/bpf/Makefile          |   1 +
 .../selftests/bpf/progs/test_lwt_redirect.c   |  67 +++++++
 .../selftests/bpf/test_lwt_redirect.sh        | 165 ++++++++++++++++++
 3 files changed, 233 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/progs/test_lwt_redirect.c
 create mode 100755 tools/testing/selftests/bpf/test_lwt_redirect.sh

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 538df8fb8c42..e3a24d053793 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -66,6 +66,7 @@ TEST_PROGS := test_kmod.sh \
 	test_xdp_vlan_mode_generic.sh \
 	test_xdp_vlan_mode_native.sh \
 	test_lwt_ip_encap.sh \
+	test_lwt_redirect.sh \
 	test_tcp_check_syncookie.sh \
 	test_tc_tunnel.sh \
 	test_tc_edt.sh \
diff --git a/tools/testing/selftests/bpf/progs/test_lwt_redirect.c b/tools/testing/selftests/bpf/progs/test_lwt_redirect.c
new file mode 100644
index 000000000000..622c6b1e7128
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_lwt_redirect.c
@@ -0,0 +1,67 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+
+#define ETH_LEN 14
+
+/* We don't care about whether the packet can be received by network stack.
+ * Just care if the packet is sent to the correct device at correct direction
+ * and not panic the kernel.
+ */
+static __always_inline int prepend_dummy_mac(struct __sk_buff *skb)
+{
+	char mac[] = {0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0xf,
+		      0xe, 0xd, 0xc, 0xb, 0xa, 0x08, 0x00};
+
+	if (bpf_skb_change_head(skb, ETH_LEN, 0)) {
+		bpf_printk("%s: fail to change head", __func__);
+		return -1;
+	}
+
+	if (bpf_skb_store_bytes(skb, 0, mac, sizeof(mac), 0)) {
+		bpf_printk("%s: fail to update mac", __func__);
+		return -1;
+	}
+
+	return 0;
+}
+
+SEC("redir_ingress")
+int test_lwt_redirect_in(struct __sk_buff *skb)
+{
+	if (prepend_dummy_mac(skb))
+		return BPF_DROP;
+
+	bpf_printk("Redirect skb to link %d ingress", skb->mark);
+	return bpf_redirect(skb->mark, BPF_F_INGRESS);
+}
+
+SEC("redir_egress")
+int test_lwt_redirect_out(struct __sk_buff *skb)
+{
+	if (prepend_dummy_mac(skb))
+		return BPF_DROP;
+
+	bpf_printk("Redirect skb to link %d egress", skb->mark);
+	return bpf_redirect(skb->mark, 0);
+}
+
+SEC("redir_egress_nomac")
+int test_lwt_redirect_out_nomac(struct __sk_buff *skb)
+{
+	int ret = bpf_redirect(skb->mark, 0);
+
+	bpf_printk("Redirect skb to link %d egress nomac: %d", skb->mark, ret);
+	return ret;
+}
+
+SEC("redir_ingress_nomac")
+int test_lwt_redirect_in_nomac(struct __sk_buff *skb)
+{
+	int ret = bpf_redirect(skb->mark, BPF_F_INGRESS);
+
+	bpf_printk("Redirect skb to link %d ingress nomac: %d", skb->mark, ret);
+	return ret;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/test_lwt_redirect.sh b/tools/testing/selftests/bpf/test_lwt_redirect.sh
new file mode 100755
index 000000000000..fe97cbc40ee8
--- /dev/null
+++ b/tools/testing/selftests/bpf/test_lwt_redirect.sh
@@ -0,0 +1,165 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# This regression test checks basic lwt redirect functionality,
+# making sure the kernel would not crash when redirecting packets
+# to a device, regardless its administration state:
+#
+# 1. redirect to a device egress/ingress should work normally
+# 2. redirect to a device egress/ingress should not panic when target is down
+# 3. redirect to a device egress/ingress should not panic when target carrier is down
+#
+# All test setup are simple: redirect ping packet via lwt xmit to cover above
+# situations. We do not worry about specific device type, except for the two
+# categories of devices that require MAC header and not require MAC header. For
+# carrier down situation, we use a vlan device as upper link, and bring down its
+# lower device.
+#
+# Kselftest framework requirement - SKIP code is 4.
+ksft_skip=4
+BPF_FILE="test_lwt_redirect.bpf.o"
+INGRESS_REDIR_IP=2.2.2.2
+EGRESS_REDIR_IP=3.3.3.3
+INGRESS_REDIR_IP_NOMAC=4.4.4.4
+EGRESS_REDIR_IP_NOMAC=5.5.5.5
+PASS=0
+FAIL=0
+
+readonly NS1="ns1-$(mktemp -u XXXXXX)"
+
+msg="skip all tests:"
+if [ $UID != 0 ]; then
+	echo $msg please run this as root >&2
+	exit $ksft_skip
+fi
+
+get_ip_direction() {
+	case $1 in
+		$INGRESS_REDIR_IP|$INGRESS_REDIR_IP_NOMAC)
+			echo ingress
+			;;
+		$EGRESS_REDIR_IP|$EGRESS_REDIR_IP_NOMAC)
+			echo egress
+			;;
+		*)
+			echo bug
+			;;
+	esac
+}
+
+test_pass()
+{
+	local testname=$1
+	local direction=`get_ip_direction $2`
+	shift 2
+	echo "Pass: $testname $direction $@"
+	PASS=$((PASS + 1))
+}
+
+test_fail()
+{
+	local testname=$1
+	local direction=`get_ip_direction $2`
+	shift 2
+	echo "Fail: $testname $direction $@"
+	FAIL=$((FAIL + 1))
+}
+
+setup() {
+	ip netns add $NS1
+
+	ip -n $NS1 link set lo up
+	ip -n $NS1 link add link_err type dummy
+	ip -n $NS1 link add link_w_mac type dummy
+	ip -n $NS1 link add link link_w_mac link_upper type vlan id 1
+	ip -n $NS1 link add link_wo_mac type gre remote 4.3.2.1 local 1.2.3.4
+	ip -n $NS1 link set link_err up
+	ip -n $NS1 link set link_w_mac up
+	ip -n $NS1 link set link_upper up
+	ip -n $NS1 link set link_wo_mac up
+
+	ip -n $NS1 addr add dev lo 1.1.1.1/32
+	ip -n $NS1 route add $INGRESS_REDIR_IP encap bpf xmit \
+		obj $BPF_FILE sec redir_ingress dev link_err
+	ip -n $NS1 route add $EGRESS_REDIR_IP encap bpf xmit \
+		obj $BPF_FILE sec redir_egress dev link_err
+	ip -n $NS1 route add $INGRESS_REDIR_IP_NOMAC encap bpf xmit \
+		obj $BPF_FILE sec redir_ingress_nomac dev link_err
+	ip -n $NS1 route add $EGRESS_REDIR_IP_NOMAC encap bpf xmit \
+		obj $BPF_FILE sec redir_egress_nomac dev link_err
+}
+
+cleanup_and_summary() {
+	ip netns del $NS1
+	echo PASSED:$PASS FAILED:$FAIL
+	if [ $FAIL -ne 0 ]; then
+		exit 1
+	else
+		exit 0
+	fi
+}
+
+test_redirect_normal() {
+	local test_name=${FUNCNAME[0]}
+	local link_name=$1
+	local link_id=`ip netns exec $NS1 cat /sys/class/net/${link_name}/ifindex`
+	local dest=$2
+
+	ip netns exec $NS1 timeout 2 tcpdump -i ${link_name} -c 1 -n -p icmp >/dev/null 2>&1 &
+	local jobid=$!
+	sleep 1
+
+	# hack: mark indicates the link to redirect to
+	ip netns exec $NS1 ping -m $link_id $dest -c 1 -w 1  > /dev/null 2>&1
+	wait $jobid
+
+	if [ $? -ne 0 ]; then
+		test_fail $test_name $dest $link_name
+	else
+		test_pass $test_name $dest $link_name
+	fi
+}
+
+test_redirect_no_panic_on_link_down() {
+	local test_name=${FUNCNAME[0]}
+	local link_name=$1
+	local link_id=`ip netns exec $NS1 cat /sys/class/net/${link_name}/ifindex`
+	local dest=$2
+
+	ip -n $NS1 link set $link_name down
+	# hack: mark indicates the link to redirect to
+	ip netns exec $NS1 ping -m $link_id $dest -c 1 -w 1 >/dev/null 2>&1
+
+	test_pass $test_name $dest to $link_name
+	ip -n $NS1 link set $link_name up
+}
+
+test_redirect_no_panic_on_link_carrier_down() {
+	local test_name=${FUNCNAME[0]}
+	local link_id=`ip netns exec $NS1 cat /sys/class/net/link_upper/ifindex`
+	local dest=$1
+
+	ip -n $NS1 link set link_w_mac down
+	# hack: mark indicates the link to redirect to
+	ip netns exec $NS1 ping -m $link_id $dest -c 1 -w 1 >/dev/null 2>&1
+
+	test_pass $test_name $dest to link_upper
+	ip -n $NS1 link set link_w_mac up
+}
+
+setup
+
+echo "Testing lwt redirect to devices requiring MAC header"
+for dest in $INGRESS_REDIR_IP $EGRESS_REDIR_IP; do
+	test_redirect_normal link_w_mac $dest
+	test_redirect_no_panic_on_link_down link_w_mac $dest
+	test_redirect_no_panic_on_link_carrier_down $dest
+done
+
+echo "Testing lwt redirect to devices not requiring MAC header"
+for dest in $INGRESS_REDIR_IP_NOMAC $EGRESS_REDIR_IP_NOMAC; do
+	test_redirect_normal link_wo_mac $dest
+	test_redirect_no_panic_on_link_down link_wo_mac $dest
+done
+
+cleanup_and_summary
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 bpf 1/2] bpf: fix skb_do_redirect return values
  2023-07-25  4:13 ` [PATCH v3 bpf 1/2] bpf: fix skb_do_redirect return values Yan Zhai
@ 2023-07-25  5:10   ` Markus Elfring
  2023-07-25 14:40     ` Yan Zhai
  2023-07-25  9:08   ` Jakub Sitnicki
  1 sibling, 1 reply; 10+ messages in thread
From: Markus Elfring @ 2023-07-25  5:10 UTC (permalink / raw)
  To: Yan Zhai, kernel-team, bpf, linux-kselftest, netdev,
	kernel-janitors, Andrii Nakryiko, Alexei Starovoitov,
	Daniel Borkmann, David S. Miller, Eric Dumazet, Hao Luo,
	Jakub Kicinski, Jiri Olsa, John Fastabend, KP Singh,
	Martin KaFai Lau, Mykola Lysenko, Paolo Abeni, Shuah Khan,
	Song Liu, Yonghong Song
  Cc: LKML, Jordan Griege, Stanislav Fomichev

>                                      … unexpected problems. This change
> converts the positive status code to proper error code.

Please choose a corresponding imperative change suggestion.

See also:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?h=v6.5-rc3#n94


Did you provide sufficient justification for a possible addition of the tag “Fixes”?


…
> v2: code style change suggested by Stanislav Fomichev
> ---
>  net/core/filter.c | 12 +++++++++++-
…

How do you think about to replace this marker by a line break?

See also:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?h=v6.5-rc3#n711

Regards,
Markus

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 bpf 1/2] bpf: fix skb_do_redirect return values
  2023-07-25  4:13 ` [PATCH v3 bpf 1/2] bpf: fix skb_do_redirect return values Yan Zhai
  2023-07-25  5:10   ` Markus Elfring
@ 2023-07-25  9:08   ` Jakub Sitnicki
  2023-07-25 16:05     ` Yan Zhai
  1 sibling, 1 reply; 10+ messages in thread
From: Jakub Sitnicki @ 2023-07-25  9:08 UTC (permalink / raw)
  To: Yan Zhai
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Mykola Lysenko,
	Shuah Khan, linux-kernel, netdev, linux-kselftest, Jordan Griege,
	kernel-team

On Mon, Jul 24, 2023 at 09:13 PM -07, Yan Zhai wrote:
> skb_do_redirect returns various of values: error code (negative), 0
> (success), and some positive status code, e.g. NET_XMIT_CN, NET_RX_DROP.
> Such code are not handled at lwt xmit hook in function ip_finish_output2
> and ip6_finish_output, which can cause unexpected problems. This change
> converts the positive status code to proper error code.
>
> Suggested-by: Stanislav Fomichev <sdf@google.com>
> Reported-by: Jordan Griege <jgriege@cloudflare.com>
> Signed-off-by: Yan Zhai <yan@cloudflare.com>
>
> ---
> v3: converts also RX side return value in addition to TX values
> v2: code style change suggested by Stanislav Fomichev
> ---
>  net/core/filter.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 06ba0e56e369..3e232ce11ca0 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -2095,7 +2095,12 @@ static const struct bpf_func_proto bpf_csum_level_proto = {
>  
>  static inline int __bpf_rx_skb(struct net_device *dev, struct sk_buff *skb)
>  {
> -	return dev_forward_skb_nomtu(dev, skb);
> +	int ret = dev_forward_skb_nomtu(dev, skb);
> +
> +	if (unlikely(ret > 0))
> +		return -ENETDOWN;
> +
> +	return 0;
>  }
>  
>  static inline int __bpf_rx_skb_no_mac(struct net_device *dev,
> @@ -2106,6 +2111,8 @@ static inline int __bpf_rx_skb_no_mac(struct net_device *dev,
>  	if (likely(!ret)) {
>  		skb->dev = dev;
>  		ret = netif_rx(skb);
> +	} else if (ret > 0) {
> +		return -ENETDOWN;
>  	}
>  
>  	return ret;
> @@ -2129,6 +2136,9 @@ static inline int __bpf_tx_skb(struct net_device *dev, struct sk_buff *skb)
>  	ret = dev_queue_xmit(skb);
>  	dev_xmit_recursion_dec();
>  
> +	if (unlikely(ret > 0))
> +		ret = net_xmit_errno(ret);
> +
>  	return ret;
>  }

net_xmit_errno maps NET_XMIT_DROP to -ENOBUFS. It would make sense to me
to map NET_RX_DROP to -ENOBUFS as well, instead of -ENETDOWN, to be
consistent.

It looks like the Fixes tag for this should point to the change that
introduced BPF for LWT:

Fixes: 3a0af8fd61f9 ("bpf: BPF for lightweight tunnel infrastructure")


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 bpf 2/2] selftests/bpf: test lwt redirect error handling
  2023-07-25  4:14 ` [PATCH v3 bpf 2/2] selftests/bpf: test lwt redirect error handling Yan Zhai
@ 2023-07-25 13:18   ` Jakub Sitnicki
  0 siblings, 0 replies; 10+ messages in thread
From: Jakub Sitnicki @ 2023-07-25 13:18 UTC (permalink / raw)
  To: Yan Zhai
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Mykola Lysenko,
	Shuah Khan, linux-kernel, netdev, linux-kselftest, Jordan Griege,
	kernel-team

On Mon, Jul 24, 2023 at 09:14 PM -07, Yan Zhai wrote:
> Tests BPF redirect at the lwt xmit hook to ensure error handling are
> safe, i.e. won't panic the kernel.
>
> Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Signed-off-by: Yan Zhai <yan@cloudflare.com>
> ---
>  tools/testing/selftests/bpf/Makefile          |   1 +
>  .../selftests/bpf/progs/test_lwt_redirect.c   |  67 +++++++
>  .../selftests/bpf/test_lwt_redirect.sh        | 165 ++++++++++++++++++
>  3 files changed, 233 insertions(+)
>  create mode 100644 tools/testing/selftests/bpf/progs/test_lwt_redirect.c
>  create mode 100755 tools/testing/selftests/bpf/test_lwt_redirect.sh
>
> diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
> index 538df8fb8c42..e3a24d053793 100644
> --- a/tools/testing/selftests/bpf/Makefile
> +++ b/tools/testing/selftests/bpf/Makefile
> @@ -66,6 +66,7 @@ TEST_PROGS := test_kmod.sh \
>  	test_xdp_vlan_mode_generic.sh \
>  	test_xdp_vlan_mode_native.sh \
>  	test_lwt_ip_encap.sh \
> +	test_lwt_redirect.sh \
>  	test_tcp_check_syncookie.sh \
>  	test_tc_tunnel.sh \
>  	test_tc_edt.sh \
> diff --git a/tools/testing/selftests/bpf/progs/test_lwt_redirect.c b/tools/testing/selftests/bpf/progs/test_lwt_redirect.c
> new file mode 100644
> index 000000000000..622c6b1e7128
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/test_lwt_redirect.c
> @@ -0,0 +1,67 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <linux/bpf.h>
> +#include <bpf/bpf_helpers.h>
> +
> +#define ETH_LEN 14

Nit: We have ETH_HLEN in bpf_tracing_net.h defined.

[...]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 bpf 0/2] bpf: return proper error codes for lwt redirect
  2023-07-25  4:12 [PATCH v3 bpf 0/2] bpf: return proper error codes for lwt redirect Yan Zhai
  2023-07-25  4:13 ` [PATCH v3 bpf 1/2] bpf: fix skb_do_redirect return values Yan Zhai
  2023-07-25  4:14 ` [PATCH v3 bpf 2/2] selftests/bpf: test lwt redirect error handling Yan Zhai
@ 2023-07-25 13:20 ` Jakub Sitnicki
  2 siblings, 0 replies; 10+ messages in thread
From: Jakub Sitnicki @ 2023-07-25 13:20 UTC (permalink / raw)
  To: Yan Zhai
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Mykola Lysenko,
	Shuah Khan, linux-kernel, netdev, linux-kselftest, Jordan Griege,
	kernel-team

On Mon, Jul 24, 2023 at 09:12 PM -07, Yan Zhai wrote:
> lwt xmit hook does not expect positive return values in function
> ip_finish_output2 and ip6_finish_output2. However, BPF redirect programs
> can return positive values such like NET_XMIT_DROP, NET_RX_DROP, and etc
> as errors. Such return values can panic the kernel unexpectedly:
>
> https://gist.github.com/zhaiyan920/8fbac245b261fe316a7ef04c9b1eba48
>
> This patch fixes the return values from BPF redirect, so the error
> handling would be consistent at xmit hook. It also adds a few test cases
> to prevent future regressions.
>
> v2: https://lore.kernel.org/netdev/ZLdY6JkWRccunvu0@debian.debian/ 
> v1: https://lore.kernel.org/bpf/ZLbYdpWC8zt9EJtq@debian.debian/
>
> changes since v2:
>   * subject name changed
>   * also covered redirect to ingress case
>   * added selftests
>
> changes since v1:
>   * minor code style changes
>
> Yan Zhai (2):
>   bpf: fix skb_do_redirect return values
>   bpf: selftests: add lwt redirect regression test cases
>
>  net/core/filter.c                             |  12 +-
>  tools/testing/selftests/bpf/Makefile          |   1 +
>  .../selftests/bpf/progs/test_lwt_redirect.c   |  67 +++++++
>  .../selftests/bpf/test_lwt_redirect.sh        | 165 ++++++++++++++++++
>  4 files changed, 244 insertions(+), 1 deletion(-)
>  create mode 100644 tools/testing/selftests/bpf/progs/test_lwt_redirect.c
>  create mode 100755 tools/testing/selftests/bpf/test_lwt_redirect.sh

For the series:

Tested-by: Jakub Sitnicki <jakub@cloudflare.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 bpf 1/2] bpf: fix skb_do_redirect return values
  2023-07-25  5:10   ` Markus Elfring
@ 2023-07-25 14:40     ` Yan Zhai
  0 siblings, 0 replies; 10+ messages in thread
From: Yan Zhai @ 2023-07-25 14:40 UTC (permalink / raw)
  To: Markus Elfring
  Cc: kernel-team, bpf, linux-kselftest, netdev, kernel-janitors,
	Andrii Nakryiko, Alexei Starovoitov, Daniel Borkmann,
	David S. Miller, Eric Dumazet, Hao Luo, Jakub Kicinski, Jiri Olsa,
	John Fastabend, KP Singh, Martin KaFai Lau, Mykola Lysenko,
	Paolo Abeni, Shuah Khan, Song Liu, Yonghong Song, LKML,
	Jordan Griege, Stanislav Fomichev

On Tue, Jul 25, 2023 at 12:11 AM Markus Elfring <Markus.Elfring@web.de> wrote:
>
> >                                      … unexpected problems. This change
> > converts the positive status code to proper error code.
>
> Please choose a corresponding imperative change suggestion.
>
> See also:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?h=v6.5-rc3#n94
>
>
> Did you provide sufficient justification for a possible addition of the tag “Fixes”?
>
>
> …
> > v2: code style change suggested by Stanislav Fomichev
> > ---
> >  net/core/filter.c | 12 +++++++++++-
> …
>
> How do you think about to replace this marker by a line break?
>
> See also:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?h=v6.5-rc3#n711
>
> Regards,
> Markus

Hi Markus,

   Thanks for the suggestions, those are what I could use more help with.
   Will address these in the next version.

Yan

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 bpf 1/2] bpf: fix skb_do_redirect return values
  2023-07-25  9:08   ` Jakub Sitnicki
@ 2023-07-25 16:05     ` Yan Zhai
  2023-07-25 17:07       ` Stanislav Fomichev
  0 siblings, 1 reply; 10+ messages in thread
From: Yan Zhai @ 2023-07-25 16:05 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Mykola Lysenko,
	Shuah Khan, linux-kernel, netdev, linux-kselftest, Jordan Griege,
	kernel-team

On Tue, Jul 25, 2023 at 4:14 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>
> On Mon, Jul 24, 2023 at 09:13 PM -07, Yan Zhai wrote:
> > skb_do_redirect returns various of values: error code (negative), 0
> > (success), and some positive status code, e.g. NET_XMIT_CN, NET_RX_DROP.
> > Such code are not handled at lwt xmit hook in function ip_finish_output2
> > and ip6_finish_output, which can cause unexpected problems. This change
> > converts the positive status code to proper error code.
> >
> > Suggested-by: Stanislav Fomichev <sdf@google.com>
> > Reported-by: Jordan Griege <jgriege@cloudflare.com>
> > Signed-off-by: Yan Zhai <yan@cloudflare.com>
> >
> > ---
> > v3: converts also RX side return value in addition to TX values
> > v2: code style change suggested by Stanislav Fomichev
> > ---
> >  net/core/filter.c | 12 +++++++++++-
> >  1 file changed, 11 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index 06ba0e56e369..3e232ce11ca0 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -2095,7 +2095,12 @@ static const struct bpf_func_proto bpf_csum_level_proto = {
> >
> >  static inline int __bpf_rx_skb(struct net_device *dev, struct sk_buff *skb)
> >  {
> > -     return dev_forward_skb_nomtu(dev, skb);
> > +     int ret = dev_forward_skb_nomtu(dev, skb);
> > +
> > +     if (unlikely(ret > 0))
> > +             return -ENETDOWN;
> > +
> > +     return 0;
> >  }
> >
> >  static inline int __bpf_rx_skb_no_mac(struct net_device *dev,
> > @@ -2106,6 +2111,8 @@ static inline int __bpf_rx_skb_no_mac(struct net_device *dev,
> >       if (likely(!ret)) {
> >               skb->dev = dev;
> >               ret = netif_rx(skb);
> > +     } else if (ret > 0) {
> > +             return -ENETDOWN;
> >       }
> >
> >       return ret;
> > @@ -2129,6 +2136,9 @@ static inline int __bpf_tx_skb(struct net_device *dev, struct sk_buff *skb)
> >       ret = dev_queue_xmit(skb);
> >       dev_xmit_recursion_dec();
> >
> > +     if (unlikely(ret > 0))
> > +             ret = net_xmit_errno(ret);
> > +
> >       return ret;
> >  }
>
> net_xmit_errno maps NET_XMIT_DROP to -ENOBUFS. It would make sense to me
> to map NET_RX_DROP to -ENOBUFS as well, instead of -ENETDOWN, to be
> consistent.
>
In fact I looked at all those errno, but found none actually covers
this situation completely. For the redirect ingress case, there are
three reasons to fail: backlog full, dev down, and MTU issue. This
won't be a problem for typical RX paths, since the error code is
usually discarded by call sites of deliver_skb. But redirect ingress
opens a call chain that would propagate this error to local sendmsg,
which may be very confusing to troubleshoot in a complex environment
(especially when backlog fills).

That said I agree ENOBUF covers the most likely reason to fail
(backlog). Let me change to that one in the next version if there are
no new suggestions.

> It looks like the Fixes tag for this should point to the change that
> introduced BPF for LWT:
>
> Fixes: 3a0af8fd61f9 ("bpf: BPF for lightweight tunnel infrastructure")
>
Thanks for finding the tag. I was debating if it should be LWT commit
or bpf_redirect commit: the error is not handled at LWT, but it seems
actually innocent. The actual fix is the return value from the bpf
redirect code. Let me incorporate both in the next one to justify
better.

--
Yan

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 bpf 1/2] bpf: fix skb_do_redirect return values
  2023-07-25 16:05     ` Yan Zhai
@ 2023-07-25 17:07       ` Stanislav Fomichev
  0 siblings, 0 replies; 10+ messages in thread
From: Stanislav Fomichev @ 2023-07-25 17:07 UTC (permalink / raw)
  To: Yan Zhai
  Cc: Jakub Sitnicki, bpf, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Hao Luo, Jiri Olsa, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Mykola Lysenko,
	Shuah Khan, linux-kernel, netdev, linux-kselftest, Jordan Griege,
	kernel-team

On 07/25, Yan Zhai wrote:
> On Tue, Jul 25, 2023 at 4:14 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
> >
> > On Mon, Jul 24, 2023 at 09:13 PM -07, Yan Zhai wrote:
> > > skb_do_redirect returns various of values: error code (negative), 0
> > > (success), and some positive status code, e.g. NET_XMIT_CN, NET_RX_DROP.
> > > Such code are not handled at lwt xmit hook in function ip_finish_output2
> > > and ip6_finish_output, which can cause unexpected problems. This change
> > > converts the positive status code to proper error code.
> > >
> > > Suggested-by: Stanislav Fomichev <sdf@google.com>
> > > Reported-by: Jordan Griege <jgriege@cloudflare.com>
> > > Signed-off-by: Yan Zhai <yan@cloudflare.com>
> > >
> > > ---
> > > v3: converts also RX side return value in addition to TX values
> > > v2: code style change suggested by Stanislav Fomichev
> > > ---
> > >  net/core/filter.c | 12 +++++++++++-
> > >  1 file changed, 11 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/net/core/filter.c b/net/core/filter.c
> > > index 06ba0e56e369..3e232ce11ca0 100644
> > > --- a/net/core/filter.c
> > > +++ b/net/core/filter.c
> > > @@ -2095,7 +2095,12 @@ static const struct bpf_func_proto bpf_csum_level_proto = {
> > >
> > >  static inline int __bpf_rx_skb(struct net_device *dev, struct sk_buff *skb)
> > >  {
> > > -     return dev_forward_skb_nomtu(dev, skb);
> > > +     int ret = dev_forward_skb_nomtu(dev, skb);
> > > +
> > > +     if (unlikely(ret > 0))
> > > +             return -ENETDOWN;
> > > +
> > > +     return 0;
> > >  }
> > >
> > >  static inline int __bpf_rx_skb_no_mac(struct net_device *dev,
> > > @@ -2106,6 +2111,8 @@ static inline int __bpf_rx_skb_no_mac(struct net_device *dev,
> > >       if (likely(!ret)) {
> > >               skb->dev = dev;
> > >               ret = netif_rx(skb);
> > > +     } else if (ret > 0) {
> > > +             return -ENETDOWN;
> > >       }
> > >
> > >       return ret;
> > > @@ -2129,6 +2136,9 @@ static inline int __bpf_tx_skb(struct net_device *dev, struct sk_buff *skb)
> > >       ret = dev_queue_xmit(skb);
> > >       dev_xmit_recursion_dec();
> > >
> > > +     if (unlikely(ret > 0))
> > > +             ret = net_xmit_errno(ret);
> > > +
> > >       return ret;
> > >  }
> >
> > net_xmit_errno maps NET_XMIT_DROP to -ENOBUFS. It would make sense to me
> > to map NET_RX_DROP to -ENOBUFS as well, instead of -ENETDOWN, to be
> > consistent.
> >
> In fact I looked at all those errno, but found none actually covers
> this situation completely. For the redirect ingress case, there are
> three reasons to fail: backlog full, dev down, and MTU issue. This
> won't be a problem for typical RX paths, since the error code is
> usually discarded by call sites of deliver_skb. But redirect ingress
> opens a call chain that would propagate this error to local sendmsg,
> which may be very confusing to troubleshoot in a complex environment
> (especially when backlog fills).
> 
> That said I agree ENOBUF covers the most likely reason to fail
> (backlog). Let me change to that one in the next version if there are
> no new suggestions.

nit: also maybe wrap these rx paths into some new net_rx_errno ?
To mirror the tx side.
 
> > It looks like the Fixes tag for this should point to the change that
> > introduced BPF for LWT:
> >
> > Fixes: 3a0af8fd61f9 ("bpf: BPF for lightweight tunnel infrastructure")
> >
> Thanks for finding the tag. I was debating if it should be LWT commit
> or bpf_redirect commit: the error is not handled at LWT, but it seems
> actually innocent. The actual fix is the return value from the bpf
> redirect code. Let me incorporate both in the next one to justify
> better.
> 
> --
> Yan

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2023-07-25 17:07 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-07-25  4:12 [PATCH v3 bpf 0/2] bpf: return proper error codes for lwt redirect Yan Zhai
2023-07-25  4:13 ` [PATCH v3 bpf 1/2] bpf: fix skb_do_redirect return values Yan Zhai
2023-07-25  5:10   ` Markus Elfring
2023-07-25 14:40     ` Yan Zhai
2023-07-25  9:08   ` Jakub Sitnicki
2023-07-25 16:05     ` Yan Zhai
2023-07-25 17:07       ` Stanislav Fomichev
2023-07-25  4:14 ` [PATCH v3 bpf 2/2] selftests/bpf: test lwt redirect error handling Yan Zhai
2023-07-25 13:18   ` Jakub Sitnicki
2023-07-25 13:20 ` [PATCH v3 bpf 0/2] bpf: return proper error codes for lwt redirect Jakub Sitnicki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).