* [PATCH bpf 0/2] bpf: fix wrong copied_seq calculation and add tests
@ 2024-12-05 12:02 Jiayuan Chen
2024-12-05 12:02 ` [PATCH bpf 1/2] bpf: fix wrong copied_seq calculation Jiayuan Chen
2024-12-05 12:02 ` [PATCH bpf 2/2] selftests/bpf: add strparser test for bpf Jiayuan Chen
From: Jiayuan Chen @ 2024-12-05 12:02 UTC (permalink / raw)
To: bpf
Cc: martin.lau, ast, edumazet, jakub, davem, dsahern, kuba, pabeni,
netdev, linux-kernel, song, john.fastabend, andrii, mhal,
yonghong.song, daniel, xiyou.wangcong, Jiayuan Chen
bpf: fix wrong copied_seq calculation and add tests
A previous commit, discussed in this thread:
https://lore.kernel.org/bpf/20230523025618.113937-9-john.fastabend@gmail.com
updates 'sk->copied_seq' directly in tcp_eat_skb() when the verdict of
the BPF program is SK_REDIRECT. For other verdicts, such as SK_PASS, it
moved the 'sk->copied_seq' update into tcp_bpf_recvmsg_parser() so that
the 'fionread' feature stays accurate.
That commit works when only stream_verdict is attached, because it also
changed the 'sk_data_ready->sk_psock_verdict_data_ready->tcp_read_skb'
path to stop updating 'sk->copied_seq'.
However, when both stream_parser and stream_verdict are attached (the
strparser case), tcp_read_sock() is used instead of tcp_read_skb()
(sk_data_ready->strp_data_ready->tcp_read_sock), and tcp_read_sock()
still updates 'sk->copied_seq', so the update happens twice.
In summary, for strparser + SK_PASS, copied_seq is updated redundantly
in both tcp_read_sock() and tcp_bpf_recvmsg_parser().
The resulting copied_seq is wrong, which prevents user space from
reading data correctly through recv().
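The double update can be illustrated with a small user-space model. This
is only a sketch under stated assumptions, not kernel code: 'struct
toy_tcp' and every toy_*() helper are hypothetical names, and the model
keeps nothing but the two sequence counters.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model (hypothetical, user space only): the two counters involved. */
struct toy_tcp {
	uint32_t rcv_nxt;    /* next sequence number expected from the peer */
	uint32_t copied_seq; /* first sequence number not yet read by the user */
};

/* Data arrives from the peer. */
static void toy_receive(struct toy_tcp *tp, uint32_t len)
{
	tp->rcv_nxt += len;
}

/* Models the read_sock step driven by strparser. With
 * updates_copied_seq != 0 it behaves like tcp_read_sock(); with 0 it
 * behaves like the proposed tcp_bpf_read_sock().
 */
static void toy_read_sock(struct toy_tcp *tp, uint32_t len,
			  int updates_copied_seq)
{
	if (updates_copied_seq)
		tp->copied_seq += len;
}

/* Models tcp_bpf_recvmsg_parser(): the user reads 'len' bytes. */
static void toy_recvmsg_parser(struct toy_tcp *tp, uint32_t len)
{
	assert(len > 0);
	tp->copied_seq += len;
}

/* Runs one SK_PASS message through both steps and returns how far
 * copied_seq ended up ahead of rcv_nxt (0 means consistent).
 */
static int32_t toy_overshoot(uint32_t len, int read_sock_updates)
{
	struct toy_tcp tp = { 0, 0 };

	toy_receive(&tp, len);
	toy_read_sock(&tp, len, read_sock_updates);
	toy_recvmsg_parser(&tp, len);
	return (int32_t)(tp.copied_seq - tp.rcv_nxt);
}
```

In this model, toy_overshoot(10, 1) leaves copied_seq 10 bytes past
rcv_nxt, while toy_overshoot(10, 0) leaves the two counters equal.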
Modifying tcp_read_sock() or the strparser implementation directly is
not reasonable, as both are widely used by other modules.
Instead, we introduce tcp_bpf_read_sock() and use it to replace
'sk->sk_socket->ops->read_sock' (as 'tls_build_proto()' does in
tls_main.c). The same kind of replacement is already done for
tcp_bpf_prots in tcp_bpf.c, so this follows an existing pattern.
(Note that checkpatch.pl may complain about a missing 'const' qualifier
on the bpf-specific 'proto_ops' definition, but the qualifier has to be
omitted because we need to update the structure.)
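The replacement pattern described above (copy a const ops table into a
writable one, override a single callback, and keep the original pointer
for restore) can be sketched in plain C. This is a minimal illustration
only: 'struct toy_ops', 'struct toy_socket' and all toy_*/base_*
functions are hypothetical stand-ins, not the kernel's types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for 'struct proto_ops' with two callbacks. */
struct toy_ops {
	int (*read_sock)(int len);
	int (*poll)(void);
};

static int base_read_sock(int len) { return len; }
static int base_poll(void) { return 1; }
/* Replacement callback; returns a distinct value so it can be told apart. */
static int toy_bpf_read_sock(int len) { return -len; }

/* The shared base table stays const. */
static const struct toy_ops base_ops = {
	.read_sock = base_read_sock,
	.poll = base_poll,
};

/* Not const: it is filled in from base_ops at init time. */
static struct toy_ops toy_bpf_ops;

/* Copy every callback from the base, then override only read_sock. */
static void toy_rebuild_ops(struct toy_ops *ops, const struct toy_ops *base)
{
	assert(ops != NULL && base != NULL);
	*ops = *base;
	ops->read_sock = toy_bpf_read_sock;
}

struct toy_socket {
	const struct toy_ops *ops;       /* table currently in use */
	const struct toy_ops *saved_ops; /* original table, kept for restore */
};

/* Save the original table and switch to the replacement. */
static void toy_attach(struct toy_socket *sock)
{
	sock->saved_ops = sock->ops;
	sock->ops = &toy_bpf_ops;
}

/* Put the original table back. */
static void toy_detach(struct toy_socket *sock)
{
	sock->ops = sock->saved_ops;
}
```

After toy_rebuild_ops(&toy_bpf_ops, &base_ops) and toy_attach(), only
read_sock differs from the base table; toy_detach() restores the saved
pointer.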
We also remove the strparser check in tcp_eat_skb(), since the custom
tcp_bpf_read_sock() does not update copied_seq.
Since strparser currently supports only TCP, it is sufficient for 'ops'
to inherit inet_stream_ops.
Regardless of partial or full reads, strparser clones the entire skb,
which allows us to free the skb unconditionally in
tcp_bpf_read_sock().
We add test cases for bpf + strparser and separate them from
sockmap_basic, because more strparser test cases will be added in the
future.
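The partial-read test relies on the parser's return value: 0 means more
header bytes are needed, and a value larger than the bytes available
means the body is still incomplete. A minimal user-space sketch of that
contract, assuming the 4-byte header and 6-byte body used by the test
(toy_parser() and toy_deliverable() are hypothetical names):

```c
#include <stddef.h>

#define STRP_HEAD_LEN 4
#define STRP_BODY_LEN 6
#define STRP_FULL_LEN (STRP_HEAD_LEN + STRP_BODY_LEN)

/* Mirrors the logic of prog_skb_parser_partial: return 0 while the
 * header is incomplete, otherwise the full message length.
 */
static size_t toy_parser(size_t avail)
{
	if (avail < STRP_HEAD_LEN)
		return 0;
	return STRP_FULL_LEN;
}

/* Number of bytes that can be handed to the verdict program given
 * 'avail' buffered bytes: nothing until a complete message is present.
 */
static size_t toy_deliverable(size_t avail)
{
	size_t want = toy_parser(avail);

	if (want == 0 || avail < want)
		return 0;
	return want;
}
```

With 3 or 9 buffered bytes nothing is deliverable, which is why recv()
is expected to return no data in steps 1.1 and 2.2 of the test; with 10
bytes the full message is delivered.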
---
Jiayuan Chen (2):
bpf: fix wrong copied_seq calculation
selftests/bpf: add strparser test for bpf
include/linux/skmsg.h | 1 +
include/net/tcp.h | 1 +
net/core/skmsg.c | 3 +
net/ipv4/tcp.c | 2 +-
net/ipv4/tcp_bpf.c | 77 +++++-
.../selftests/bpf/prog_tests/sockmap_basic.c | 53 ----
.../selftests/bpf/prog_tests/sockmap_strp.c | 255 ++++++++++++++++++
.../selftests/bpf/progs/test_sockmap_strp.c | 51 ++++
8 files changed, 386 insertions(+), 57 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/sockmap_strp.c
create mode 100644 tools/testing/selftests/bpf/progs/test_sockmap_strp.c
base-commit: 5a6ea7022ff4d2a65ae328619c586d6a8909b48b
--
2.43.5
* [PATCH bpf 1/2] bpf: fix wrong copied_seq calculation
From: Jiayuan Chen @ 2024-12-05 12:02 UTC (permalink / raw)
To: bpf
Cc: martin.lau, ast, edumazet, jakub, davem, dsahern, kuba, pabeni,
netdev, linux-kernel, song, john.fastabend, andrii, mhal,
yonghong.song, daniel, xiyou.wangcong, Jiayuan Chen
A previous change made tcp_eat_skb() update 'sk->copied_seq' when the
verdict of a BPF program is SK_REDIRECT. For other verdicts, such as
SK_PASS, the 'sk->copied_seq' update was moved to
tcp_bpf_recvmsg_parser() so that the 'fionread' feature stays accurate.
That works when only stream_verdict is attached, because the change also
modified the 'sk_data_ready->sk_psock_verdict_data_ready->tcp_read_skb'
path to stop updating 'sk->copied_seq'.
However, when both stream_parser and stream_verdict are attached (the
strparser case), tcp_read_sock() is used instead of tcp_read_skb()
(sk_data_ready->strp_data_ready->tcp_read_sock), and tcp_read_sock()
still updates 'sk->copied_seq', so the update happens twice.
In summary, for strparser + SK_PASS, copied_seq is updated redundantly
in both tcp_read_sock() and tcp_bpf_recvmsg_parser().
The resulting copied_seq is wrong, which prevents user space from
reading data correctly through recv().
Modifying tcp_read_sock() or the strparser implementation directly is
not reasonable, as both are widely used by other modules.
Instead, we introduce tcp_bpf_read_sock() and use it to replace
'sk->sk_socket->ops->read_sock' (as 'tls_build_proto()' does in
tls_main.c). The same kind of replacement is already done for
tcp_bpf_prots in tcp_bpf.c, so this follows an existing pattern.
(Note that checkpatch.pl may complain about a missing 'const' qualifier
on the bpf-specific 'proto_ops' definition, but the qualifier has to be
omitted because we need to update the structure.)
We also remove the strparser check in tcp_eat_skb(), since the custom
tcp_bpf_read_sock() does not update copied_seq.
Since strparser currently supports only TCP, it is sufficient for 'ops'
to inherit inet_stream_ops.
Regardless of partial or full reads, strparser clones the entire skb,
which allows us to free the skb unconditionally in
tcp_bpf_read_sock().
Signed-off-by: Jiayuan Chen <mrpre@163.com>
---
include/linux/skmsg.h | 1 +
include/net/tcp.h | 1 +
net/core/skmsg.c | 3 ++
net/ipv4/tcp.c | 2 +-
net/ipv4/tcp_bpf.c | 77 +++++++++++++++++++++++++++++++++++++++++--
5 files changed, 80 insertions(+), 4 deletions(-)
diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
index d9b03e0746e7..db1a6fff3cc1 100644
--- a/include/linux/skmsg.h
+++ b/include/linux/skmsg.h
@@ -112,6 +112,7 @@ struct sk_psock {
int (*psock_update_sk_prot)(struct sock *sk, struct sk_psock *psock,
bool restore);
struct proto *sk_proto;
+ const struct proto_ops *sk_proto_ops;
struct mutex work_mutex;
struct sk_psock_work_state work_state;
struct delayed_work work;
diff --git a/include/net/tcp.h b/include/net/tcp.h
index e9b37b76e894..fb3215936ece 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -353,6 +353,7 @@ ssize_t tcp_splice_read(struct socket *sk, loff_t *ppos,
unsigned int flags);
struct sk_buff *tcp_stream_alloc_skb(struct sock *sk, gfp_t gfp,
bool force_schedule);
+void tcp_eat_recv_skb(struct sock *sk, struct sk_buff *skb);
static inline void tcp_dec_quickack_mode(struct sock *sk)
{
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index e90fbab703b2..99dd75c9e689 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -702,6 +702,7 @@ struct sk_psock *sk_psock_init(struct sock *sk, int node)
{
struct sk_psock *psock;
struct proto *prot;
+ const struct proto_ops *proto_ops;
write_lock_bh(&sk->sk_callback_lock);
@@ -722,9 +723,11 @@ struct sk_psock *sk_psock_init(struct sock *sk, int node)
}
prot = READ_ONCE(sk->sk_prot);
+ proto_ops = likely(sk->sk_socket) ? sk->sk_socket->ops : NULL;
psock->sk = sk;
psock->eval = __SK_NONE;
psock->sk_proto = prot;
+ psock->sk_proto_ops = proto_ops;
psock->saved_unhash = prot->unhash;
psock->saved_destroy = prot->destroy;
psock->saved_close = prot->close;
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 0d704bda6c41..6a07d98017f7 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1517,7 +1517,7 @@ void tcp_cleanup_rbuf(struct sock *sk, int copied)
__tcp_cleanup_rbuf(sk, copied);
}
-static void tcp_eat_recv_skb(struct sock *sk, struct sk_buff *skb)
+void tcp_eat_recv_skb(struct sock *sk, struct sk_buff *skb)
{
__skb_unlink(skb, &sk->sk_receive_queue);
if (likely(skb->destructor == sock_rfree)) {
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index 99cef92e6290..94553d2367a0 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -19,9 +19,6 @@ void tcp_eat_skb(struct sock *sk, struct sk_buff *skb)
if (!skb || !skb->len || !sk_is_tcp(sk))
return;
- if (skb_bpf_strparser(skb))
- return;
-
tcp = tcp_sk(sk);
copied = tcp->copied_seq + skb->len;
WRITE_ONCE(tcp->copied_seq, copied);
@@ -578,6 +575,50 @@ static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
return copied > 0 ? copied : err;
}
+static void sock_replace_proto_ops(struct sock *sk,
+ const struct proto_ops *proto_ops)
+{
+ if (sk->sk_socket)
+ WRITE_ONCE(sk->sk_socket->ops, proto_ops);
+}
+
+/* The tcp_bpf_read_sock() is an alternative implementation
+ * of tcp_read_sock(), except that it does not update copied_seq.
+ */
+static int tcp_bpf_read_sock(struct sock *sk, read_descriptor_t *desc,
+ sk_read_actor_t recv_actor)
+{
+ struct sk_buff *skb;
+ int copied = 0;
+
+ if (sk->sk_state == TCP_LISTEN)
+ return -ENOTCONN;
+
+ while ((skb = skb_peek(&sk->sk_receive_queue)) != NULL) {
+ u8 tcp_flags;
+ int used;
+
+ WARN_ON_ONCE(!skb_set_owner_sk_safe(skb, sk));
+ tcp_flags = TCP_SKB_CB(skb)->tcp_flags;
+ used = recv_actor(desc, skb, 0, skb->len);
+ /* strparser clones and consumes every input skb, even
+ * while it is still waiting for the rest of a head or body
+ */
+ tcp_eat_recv_skb(sk, skb);
+ if (used <= 0) {
+ if (!copied)
+ copied = used;
+ break;
+ }
+ copied += used;
+ if (!desc->count)
+ break;
+ if (tcp_flags & TCPHDR_FIN)
+ break;
+ }
+ return copied;
+}
+
enum {
TCP_BPF_IPV4,
TCP_BPF_IPV6,
@@ -595,6 +636,10 @@ enum {
static struct proto *tcpv6_prot_saved __read_mostly;
static DEFINE_SPINLOCK(tcpv6_prot_lock);
static struct proto tcp_bpf_prots[TCP_BPF_NUM_PROTS][TCP_BPF_NUM_CFGS];
+/* Not 'const' because the ops are filled in at init time.
+ * checkpatch.pl may warn about the missing qualifier; ignore it.
+ */
+static struct proto_ops tcp_bpf_proto_ops[TCP_BPF_NUM_PROTS];
static void tcp_bpf_rebuild_protos(struct proto prot[TCP_BPF_NUM_CFGS],
struct proto *base)
@@ -615,6 +660,13 @@ static void tcp_bpf_rebuild_protos(struct proto prot[TCP_BPF_NUM_CFGS],
prot[TCP_BPF_TXRX].recvmsg = tcp_bpf_recvmsg_parser;
}
+static void tcp_bpf_rebuild_proto_ops(struct proto_ops *ops,
+ const struct proto_ops *base)
+{
+ *ops = *base;
+ ops->read_sock = tcp_bpf_read_sock;
+}
+
static void tcp_bpf_check_v6_needs_rebuild(struct proto *ops)
{
if (unlikely(ops != smp_load_acquire(&tcpv6_prot_saved))) {
@@ -627,6 +679,19 @@ static void tcp_bpf_check_v6_needs_rebuild(struct proto *ops)
}
}
+static int __init tcp_bpf_build_proto_ops(void)
+{
+ /* Build the v4 and v6 ops separately for future extensibility,
+ * although both currently use the same ops.
+ */
+ tcp_bpf_rebuild_proto_ops(&tcp_bpf_proto_ops[TCP_BPF_IPV4],
+ &inet_stream_ops);
+ tcp_bpf_rebuild_proto_ops(&tcp_bpf_proto_ops[TCP_BPF_IPV6],
+ &inet_stream_ops);
+ return 0;
+}
+late_initcall(tcp_bpf_build_proto_ops);
+
static int __init tcp_bpf_v4_build_proto(void)
{
tcp_bpf_rebuild_protos(tcp_bpf_prots[TCP_BPF_IPV4], &tcp_prot);
@@ -648,6 +713,7 @@ int tcp_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
{
int family = sk->sk_family == AF_INET6 ? TCP_BPF_IPV6 : TCP_BPF_IPV4;
int config = psock->progs.msg_parser ? TCP_BPF_TX : TCP_BPF_BASE;
+ bool strp = psock->progs.stream_verdict && psock->progs.stream_parser;
if (psock->progs.stream_verdict || psock->progs.skb_verdict) {
config = (config == TCP_BPF_TX) ? TCP_BPF_TXRX : TCP_BPF_RX;
@@ -666,6 +732,7 @@ int tcp_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
sk->sk_write_space = psock->saved_write_space;
/* Pairs with lockless read in sk_clone_lock() */
sock_replace_proto(sk, psock->sk_proto);
+ sock_replace_proto_ops(sk, psock->sk_proto_ops);
}
return 0;
}
@@ -679,6 +746,10 @@ int tcp_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
/* Pairs with lockless read in sk_clone_lock() */
sock_replace_proto(sk, &tcp_bpf_prots[family][config]);
+
+ if (strp)
+ sock_replace_proto_ops(sk, &tcp_bpf_proto_ops[family]);
+
return 0;
}
EXPORT_SYMBOL_GPL(tcp_bpf_update_proto);
--
2.43.5
* [PATCH bpf 2/2] selftests/bpf: add strparser test for bpf
From: Jiayuan Chen @ 2024-12-05 12:02 UTC (permalink / raw)
To: bpf
Cc: martin.lau, ast, edumazet, jakub, davem, dsahern, kuba, pabeni,
netdev, linux-kernel, song, john.fastabend, andrii, mhal,
yonghong.song, daniel, xiyou.wangcong, Jiayuan Chen
Add test cases for bpf + strparser and separate them from
sockmap_basic, because more strparser test cases will be added in the
future.
Signed-off-by: Jiayuan Chen <mrpre@163.com>
---
.../selftests/bpf/prog_tests/sockmap_basic.c | 53 ----
.../selftests/bpf/prog_tests/sockmap_strp.c | 255 ++++++++++++++++++
.../selftests/bpf/progs/test_sockmap_strp.c | 51 ++++
3 files changed, 306 insertions(+), 53 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/sockmap_strp.c
create mode 100644 tools/testing/selftests/bpf/progs/test_sockmap_strp.c
diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
index fdff0652d7ef..4c0eebc433d8 100644
--- a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
+++ b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
@@ -530,57 +530,6 @@ static void test_sockmap_skb_verdict_shutdown(void)
test_sockmap_pass_prog__destroy(skel);
}
-static void test_sockmap_stream_pass(void)
-{
- int zero = 0, sent, recvd;
- int verdict, parser;
- int err, map;
- int c = -1, p = -1;
- struct test_sockmap_pass_prog *pass = NULL;
- char snd[256] = "0123456789";
- char rcv[256] = "0";
-
- pass = test_sockmap_pass_prog__open_and_load();
- verdict = bpf_program__fd(pass->progs.prog_skb_verdict);
- parser = bpf_program__fd(pass->progs.prog_skb_parser);
- map = bpf_map__fd(pass->maps.sock_map_rx);
-
- err = bpf_prog_attach(parser, map, BPF_SK_SKB_STREAM_PARSER, 0);
- if (!ASSERT_OK(err, "bpf_prog_attach stream parser"))
- goto out;
-
- err = bpf_prog_attach(verdict, map, BPF_SK_SKB_STREAM_VERDICT, 0);
- if (!ASSERT_OK(err, "bpf_prog_attach stream verdict"))
- goto out;
-
- err = create_pair(AF_INET, SOCK_STREAM, &c, &p);
- if (err)
- goto out;
-
- /* sk_data_ready of 'p' will be replaced by strparser handler */
- err = bpf_map_update_elem(map, &zero, &p, BPF_NOEXIST);
- if (!ASSERT_OK(err, "bpf_map_update_elem(p)"))
- goto out_close;
-
- /*
- * as 'prog_skb_parser' return the original skb len and
- * 'prog_skb_verdict' return SK_PASS, the kernel will just
- * pass it through to original socket 'p'
- */
- sent = xsend(c, snd, sizeof(snd), 0);
- ASSERT_EQ(sent, sizeof(snd), "xsend(c)");
-
- recvd = recv_timeout(p, rcv, sizeof(rcv), SOCK_NONBLOCK,
- IO_TIMEOUT_SEC);
- ASSERT_EQ(recvd, sizeof(rcv), "recv_timeout(p)");
-
-out_close:
- close(c);
- close(p);
-
-out:
- test_sockmap_pass_prog__destroy(pass);
-}
static void test_sockmap_skb_verdict_fionread(bool pass_prog)
{
@@ -1050,8 +999,6 @@ void test_sockmap_basic(void)
test_sockmap_progs_query(BPF_SK_SKB_VERDICT);
if (test__start_subtest("sockmap skb_verdict shutdown"))
test_sockmap_skb_verdict_shutdown();
- if (test__start_subtest("sockmap stream parser and verdict pass"))
- test_sockmap_stream_pass();
if (test__start_subtest("sockmap skb_verdict fionread"))
test_sockmap_skb_verdict_fionread(true);
if (test__start_subtest("sockmap skb_verdict fionread on drop"))
diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_strp.c b/tools/testing/selftests/bpf/prog_tests/sockmap_strp.c
new file mode 100644
index 000000000000..1157a9410f87
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/sockmap_strp.c
@@ -0,0 +1,255 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <error.h>
+
+#include <test_progs.h>
+#include "sockmap_helpers.h"
+#include "test_skmsg_load_helpers.skel.h"
+#include "test_sockmap_strp.skel.h"
+#define STRP_HEAD_LEN 4
+#define STRP_BODY_LEN 6
+#define STRP_FULL_LEN (STRP_HEAD_LEN + STRP_BODY_LEN)
+
+static void test_sockmap_strp_partial_read(int family, int sotype)
+{
+ int zero = 0, recvd, off;
+ int verdict, parser;
+ int err, map;
+ int c = -1, p = -1;
+ struct test_sockmap_strp *strp = NULL;
+ char snd[STRP_FULL_LEN] = "head+body\0";
+ char rcv[256] = "0";
+
+ strp = test_sockmap_strp__open_and_load();
+ verdict = bpf_program__fd(strp->progs.prog_skb_verdict_pass);
+ parser = bpf_program__fd(strp->progs.prog_skb_parser_partial);
+ map = bpf_map__fd(strp->maps.sock_map);
+
+ err = bpf_prog_attach(parser, map, BPF_SK_SKB_STREAM_PARSER, 0);
+ if (!ASSERT_OK(err, "bpf_prog_attach stream parser"))
+ goto out;
+
+ err = bpf_prog_attach(verdict, map, BPF_SK_SKB_STREAM_VERDICT, 0);
+ if (!ASSERT_OK(err, "bpf_prog_attach stream verdict"))
+ goto out;
+
+ err = create_pair(family, sotype, &c, &p);
+ if (err)
+ goto out;
+
+ /* sk_data_ready of 'p' will be replaced by strparser handler */
+ err = bpf_map_update_elem(map, &zero, &p, BPF_NOEXIST);
+ if (!ASSERT_OK(err, "bpf_map_update_elem(zero, p)"))
+ goto out_close;
+
+ /* 1.1 send partial head, 1 byte of header left */
+ off = STRP_HEAD_LEN - 1;
+ xsend(c, snd, off, 0);
+ recvd = recv_timeout(p, rcv, sizeof(rcv), MSG_DONTWAIT, 5);
+ if (!ASSERT_EQ(-1, recvd, "insufficient head, no data should be received"))
+ goto out_close;
+
+ /* 1.2 send remaining head and body */
+ xsend(c, snd + off, STRP_FULL_LEN - off, 0);
+ recvd = recv_timeout(p, rcv, sizeof(rcv), MSG_DONTWAIT, IO_TIMEOUT_SEC);
+ if (!ASSERT_EQ(recvd, STRP_FULL_LEN, "full data should be received"))
+ goto out_close;
+
+ /* 2.1 send partial head, 1 byte header left */
+ off = STRP_HEAD_LEN - 1;
+ xsend(c, snd, off, 0);
+
+ /* 2.2 send remaining head and partial body, 1 byte body left */
+ xsend(c, snd + off, STRP_FULL_LEN - off - 1, 0);
+ off = STRP_FULL_LEN - 1;
+ recvd = recv_timeout(p, rcv, sizeof(rcv), MSG_DONTWAIT, 1);
+ if (!ASSERT_EQ(-1, recvd, "insufficient body, no data should be read"))
+ goto out_close;
+
+ /* 2.3 send remaining body */
+ xsend(c, snd + off, STRP_FULL_LEN - off, 0);
+ recvd = recv_timeout(p, rcv, sizeof(rcv), MSG_DONTWAIT, IO_TIMEOUT_SEC);
+ if (!ASSERT_EQ(recvd, STRP_FULL_LEN, "full data should be received"))
+ goto out_close;
+
+out_close:
+ close(c);
+ close(p);
+
+out:
+ test_sockmap_strp__destroy(strp);
+}
+
+static void test_sockmap_strp_pass(int family, int sotype, bool fionread)
+{
+ int zero = 0, sent, recvd, avail;
+ int verdict, parser;
+ int err, map;
+ int c = -1, p = -1;
+ int read_cnt = 10, i;
+ struct test_sockmap_strp *strp = NULL;
+ char snd[11] = "0123456789\0";
+ char rcv[256] = "0";
+
+ strp = test_sockmap_strp__open_and_load();
+ verdict = bpf_program__fd(strp->progs.prog_skb_verdict_pass);
+ parser = bpf_program__fd(strp->progs.prog_skb_parser);
+ map = bpf_map__fd(strp->maps.sock_map);
+
+ err = bpf_prog_attach(parser, map, BPF_SK_SKB_STREAM_PARSER, 0);
+ if (!ASSERT_OK(err, "bpf_prog_attach stream parser"))
+ goto out;
+
+ err = bpf_prog_attach(verdict, map, BPF_SK_SKB_STREAM_VERDICT, 0);
+ if (!ASSERT_OK(err, "bpf_prog_attach stream verdict"))
+ goto out;
+
+ err = create_pair(family, sotype, &c, &p);
+ if (err)
+ goto out;
+
+ /* sk_data_ready of 'p' will be replaced by strparser handler */
+ err = bpf_map_update_elem(map, &zero, &p, BPF_NOEXIST);
+ if (!ASSERT_OK(err, "bpf_map_update_elem(p)"))
+ goto out_close;
+
+ /*
+ * Previously, we encountered issues such as deadlocks and
+ * sequence errors that resulted in the inability to read
+ * continuously. Therefore, we perform multiple iterations
+ * of testing here.
+ */
+ for (i = 0; i < read_cnt; i++) {
+ sent = xsend(c, snd, sizeof(snd), 0);
+ if (!ASSERT_EQ(sent, sizeof(snd), "xsend(c)"))
+ goto out_close;
+
+ recvd = recv_timeout(p, rcv, sizeof(rcv), MSG_DONTWAIT,
+ IO_TIMEOUT_SEC);
+ if (!ASSERT_EQ(recvd, sizeof(snd), "recv_timeout(p)")
+ || !ASSERT_OK(memcmp(snd, rcv, sizeof(snd)),
+ "recv_timeout(p)"))
+ goto out_close;
+ }
+
+ if (fionread) {
+ sent = xsend(c, snd, sizeof(snd), 0);
+ if (!ASSERT_EQ(sent, sizeof(snd), "second xsend(c)"))
+ goto out_close;
+
+ err = ioctl(p, FIONREAD, &avail);
+ if (!ASSERT_OK(err, "ioctl(FIONREAD) error")
+ || !ASSERT_EQ(avail, sizeof(snd), "ioctl(FIONREAD)"))
+ goto out_close;
+
+ recvd = recv_timeout(p, rcv, sizeof(rcv), MSG_DONTWAIT,
+ IO_TIMEOUT_SEC);
+ if (!ASSERT_EQ(recvd, sizeof(snd), "second recv_timeout(p)")
+ || !ASSERT_OK(memcmp(snd, rcv, sizeof(snd)),
+ "second recv_timeout(p)"))
+ goto out_close;
+ }
+
+out_close:
+ close(c);
+ close(p);
+
+out:
+ test_sockmap_strp__destroy(strp);
+}
+
+static void test_sockmap_strp_verdict(int family, int sotype)
+{
+ int zero = 0, one = 1, sent, recvd, off, total_sent;
+ int verdict, parser;
+ int err, map;
+ int c0 = -1, p0 = -1, c1 = -1, p1 = -1;
+ struct test_sockmap_strp *strp = NULL;
+ char snd[11] = "0123456789\0";
+ char rcv[256] = "0";
+
+ strp = test_sockmap_strp__open_and_load();
+ verdict = bpf_program__fd(strp->progs.prog_skb_verdict);
+ parser = bpf_program__fd(strp->progs.prog_skb_parser);
+ map = bpf_map__fd(strp->maps.sock_map);
+
+ err = bpf_prog_attach(parser, map, BPF_SK_SKB_STREAM_PARSER, 0);
+ if (!ASSERT_OK(err, "bpf_prog_attach stream parser"))
+ goto out;
+
+ err = bpf_prog_attach(verdict, map, BPF_SK_SKB_STREAM_VERDICT, 0);
+ if (!ASSERT_OK(err, "bpf_prog_attach stream verdict"))
+ goto out;
+
+ /* We simulate a reverse proxy server.
+ * When p0 receives data from c0, we forward it to p1.
+ * From p1's perspective, it will consider this data
+ * as being sent by c1.
+ */
+ err = create_socket_pairs(family, sotype, &c0, &c1, &p0, &p1);
+ if (!ASSERT_OK(err, "create_socket_pairs()"))
+ goto out;
+
+ err = bpf_map_update_elem(map, &zero, &p0, BPF_NOEXIST);
+ if (!ASSERT_OK(err, "bpf_map_update_elem(p0)"))
+ goto out_close;
+
+ err = bpf_map_update_elem(map, &one, &c1, BPF_NOEXIST);
+ if (!ASSERT_OK(err, "bpf_map_update_elem(c1)"))
+ goto out_close;
+
+ total_sent = sizeof(snd);
+ sent = xsend(c0, snd, total_sent, 0);
+ if (!ASSERT_EQ(sent, total_sent, "xsend(c0)"))
+ goto out_close;
+
+ recvd = recv_timeout(p1, rcv, sizeof(rcv), MSG_DONTWAIT,
+ IO_TIMEOUT_SEC);
+ if (!ASSERT_EQ(recvd, total_sent, "recv_timeout(p1)")
+ || !ASSERT_OK(memcmp(snd, rcv, total_sent),
+ "received data does not match the sent data"))
+ goto out_close;
+
+ /* send again to ensure the stream is functioning correctly. */
+ total_sent = sizeof(snd);
+ sent = xsend(c0, snd, total_sent, 0);
+ if (!ASSERT_EQ(sent, total_sent, "second xsend(c0)"))
+ goto out_close;
+
+ /* partial read */
+ off = total_sent / 2;
+ recvd = recv_timeout(p1, rcv, off, MSG_DONTWAIT,
+ IO_TIMEOUT_SEC);
+ recvd += recv_timeout(p1, rcv + off, sizeof(rcv) - off, MSG_DONTWAIT,
+ IO_TIMEOUT_SEC);
+
+ if (!ASSERT_EQ(recvd, total_sent, "partial recv_timeout(p1)")
+ || !ASSERT_OK(memcmp(snd, rcv, total_sent),
+ "partial received data does not match the sent data"))
+ goto out_close;
+
+out_close:
+ close(c0);
+ close(c1);
+ close(p0);
+ close(p1);
+out:
+ test_sockmap_strp__destroy(strp);
+}
+
+void test_sockmap_strp(void)
+{
+ if (test__start_subtest("sockmap strp tcp pass"))
+ test_sockmap_strp_pass(AF_INET, SOCK_STREAM, false);
+ if (test__start_subtest("sockmap strp tcp v6 pass"))
+ test_sockmap_strp_pass(AF_INET6, SOCK_STREAM, false);
+ if (test__start_subtest("sockmap strp tcp pass fionread"))
+ test_sockmap_strp_pass(AF_INET, SOCK_STREAM, true);
+ if (test__start_subtest("sockmap strp tcp v6 pass fionread"))
+ test_sockmap_strp_pass(AF_INET6, SOCK_STREAM, true);
+ if (test__start_subtest("sockmap strp tcp verdict"))
+ test_sockmap_strp_verdict(AF_INET, SOCK_STREAM);
+ if (test__start_subtest("sockmap strp tcp v6 verdict"))
+ test_sockmap_strp_verdict(AF_INET6, SOCK_STREAM);
+ if (test__start_subtest("sockmap strp tcp partial read"))
+ test_sockmap_strp_partial_read(AF_INET, SOCK_STREAM);
+}
diff --git a/tools/testing/selftests/bpf/progs/test_sockmap_strp.c b/tools/testing/selftests/bpf/progs/test_sockmap_strp.c
new file mode 100644
index 000000000000..db2f3b6c87ba
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_sockmap_strp.c
@@ -0,0 +1,51 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_endian.h>
+
+struct {
+ __uint(type, BPF_MAP_TYPE_SOCKMAP);
+ __uint(max_entries, 20);
+ __type(key, int);
+ __type(value, int);
+} sock_map SEC(".maps");
+
+
+SEC("sk_skb/stream_verdict")
+int prog_skb_verdict_pass(struct __sk_buff *skb)
+{
+ return SK_PASS;
+}
+
+
+SEC("sk_skb/stream_verdict")
+int prog_skb_verdict(struct __sk_buff *skb)
+{
+ __u32 one = 1;
+
+ return bpf_sk_redirect_map(skb, &sock_map, one, 0);
+}
+
+SEC("sk_skb/stream_parser")
+int prog_skb_parser(struct __sk_buff *skb)
+{
+ return skb->len;
+}
+
+SEC("sk_skb/stream_parser")
+int prog_skb_parser_partial(struct __sk_buff *skb)
+{
+ /* The test program agrees on a 4-byte size header
+ * and a 6-byte body.
+ */
+ if (skb->len < 4) {
+ /* need more header to determine full length */
+ return 0;
+ }
+ /* Return the full length decoded from the header.
+ * The return value may be larger than skb->len, which
+ * means the framework must wait for the rest of the body.
+ */
+ return 10;
+}
+char _license[] SEC("license") = "GPL";
--
2.43.5