* [PATCH net-next v2 0/2] trace: add tracepoint to tcp_sendmsg_locked
@ 2025-04-07 13:40 Breno Leitao
2025-04-07 13:40 ` [PATCH net-next v2 1/2] net: pass const to msg_data_left() Breno Leitao
2025-04-07 13:40 ` [PATCH net-next v2 2/2] trace: tcp: Add tracepoint for tcp_sendmsg_locked() Breno Leitao
0 siblings, 2 replies; 12+ messages in thread
From: Breno Leitao @ 2025-04-07 13:40 UTC (permalink / raw)
To: David Ahern, Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima,
Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
David S. Miller, David Ahern, Jakub Kicinski, Paolo Abeni,
Simon Horman
Cc: Song Liu, Yonghong Song, netdev, linux-kernel, linux-trace-kernel,
Breno Leitao, kernel-team
Meta has been using BPF programs to monitor tcp_sendmsg() for years,
indicating significant interest in observing this important
functionality. Adding a proper tracepoint provides a stable API for all
users who need visibility into TCP message transmission.
David Ahern is using a similar functionality with a custom patch[1]. So,
this means we have more than a single use case for this request, and it
might be a good idea to have such feature upstream.
Link: https://lore.kernel.org/all/70168c8f-bf52-4279-b4c4-be64527aa1ac@kernel.org/ [1]
Signed-off-by: Breno Leitao <leitao@debian.org>
---
Changes in v2:
- Change to a full tracepoint inside tcp_sendmsg_locked(), heavily
inspired in David's patch
- Link to v1: https://lore.kernel.org/r/20250224-tcpsendmsg-v1-1-bac043c59cc8@debian.org
---
Breno Leitao (2):
net: pass const to msg_data_left()
trace: tcp: Add tracepoint for tcp_sendmsg_locked()
include/linux/socket.h | 2 +-
include/trace/events/tcp.h | 24 ++++++++++++++++++++++++
net/ipv4/tcp.c | 2 ++
3 files changed, 27 insertions(+), 1 deletion(-)
---
base-commit: 1a9239bb4253f9076b5b4b2a1a4e8d7defd77a95
change-id: 20250224-tcpsendmsg-4f0a236751d7
Best regards,
--
Breno Leitao <leitao@debian.org>
^ permalink raw reply [flat|nested] 12+ messages in thread* [PATCH net-next v2 1/2] net: pass const to msg_data_left() 2025-04-07 13:40 [PATCH net-next v2 0/2] trace: add tracepoint to tcp_sendmsg_locked Breno Leitao @ 2025-04-07 13:40 ` Breno Leitao 2025-04-08 0:53 ` Kuniyuki Iwashima 2025-04-08 14:20 ` Eric Dumazet 2025-04-07 13:40 ` [PATCH net-next v2 2/2] trace: tcp: Add tracepoint for tcp_sendmsg_locked() Breno Leitao 1 sibling, 2 replies; 12+ messages in thread From: Breno Leitao @ 2025-04-07 13:40 UTC (permalink / raw) To: David Ahern, Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers, David S. Miller, David Ahern, Jakub Kicinski, Paolo Abeni, Simon Horman Cc: Song Liu, Yonghong Song, netdev, linux-kernel, linux-trace-kernel, Breno Leitao, kernel-team The msg_data_left() function doesn't modify the struct msghdr parameter, so mark it as const. This allows the function to be used with const references, improving type safety and making the API more flexible. Signed-off-by: Breno Leitao <leitao@debian.org> --- include/linux/socket.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/socket.h b/include/linux/socket.h index c3322eb3d6865..3b262487ec060 100644 --- a/include/linux/socket.h +++ b/include/linux/socket.h @@ -168,7 +168,7 @@ static inline struct cmsghdr * cmsg_nxthdr (struct msghdr *__msg, struct cmsghdr return __cmsg_nxthdr(__msg->msg_control, __msg->msg_controllen, __cmsg); } -static inline size_t msg_data_left(struct msghdr *msg) +static inline size_t msg_data_left(const struct msghdr *msg) { return iov_iter_count(&msg->msg_iter); } -- 2.47.1 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH net-next v2 1/2] net: pass const to msg_data_left() 2025-04-07 13:40 ` [PATCH net-next v2 1/2] net: pass const to msg_data_left() Breno Leitao @ 2025-04-08 0:53 ` Kuniyuki Iwashima 2025-04-08 14:20 ` Eric Dumazet 1 sibling, 0 replies; 12+ messages in thread From: Kuniyuki Iwashima @ 2025-04-08 0:53 UTC (permalink / raw) To: leitao Cc: davem, dsahern, edumazet, horms, kernel-team, kuba, kuniyu, linux-kernel, linux-trace-kernel, mathieu.desnoyers, mhiramat, ncardwell, netdev, pabeni, rostedt, song, yonghong.song From: Breno Leitao <leitao@debian.org> Date: Mon, 07 Apr 2025 06:40:43 -0700 > The msg_data_left() function doesn't modify the struct msghdr parameter, > so mark it as const. This allows the function to be used with const > references, improving type safety and making the API more flexible. > > Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next v2 1/2] net: pass const to msg_data_left() 2025-04-07 13:40 ` [PATCH net-next v2 1/2] net: pass const to msg_data_left() Breno Leitao 2025-04-08 0:53 ` Kuniyuki Iwashima @ 2025-04-08 14:20 ` Eric Dumazet 1 sibling, 0 replies; 12+ messages in thread From: Eric Dumazet @ 2025-04-08 14:20 UTC (permalink / raw) To: Breno Leitao Cc: David Ahern, Neal Cardwell, Kuniyuki Iwashima, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers, David S. Miller, Jakub Kicinski, Paolo Abeni, Simon Horman, Song Liu, Yonghong Song, netdev, linux-kernel, linux-trace-kernel, kernel-team On Mon, Apr 7, 2025 at 3:40 PM Breno Leitao <leitao@debian.org> wrote: > > The msg_data_left() function doesn't modify the struct msghdr parameter, > so mark it as const. This allows the function to be used with const > references, improving type safety and making the API more flexible. > > Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Eric Dumazet <edumazet@google.com> ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH net-next v2 2/2] trace: tcp: Add tracepoint for tcp_sendmsg_locked() 2025-04-07 13:40 [PATCH net-next v2 0/2] trace: add tracepoint to tcp_sendmsg_locked Breno Leitao 2025-04-07 13:40 ` [PATCH net-next v2 1/2] net: pass const to msg_data_left() Breno Leitao @ 2025-04-07 13:40 ` Breno Leitao 2025-04-08 1:00 ` Kuniyuki Iwashima 2025-04-08 1:05 ` Kuniyuki Iwashima 1 sibling, 2 replies; 12+ messages in thread From: Breno Leitao @ 2025-04-07 13:40 UTC (permalink / raw) To: David Ahern, Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers, David S. Miller, David Ahern, Jakub Kicinski, Paolo Abeni, Simon Horman Cc: Song Liu, Yonghong Song, netdev, linux-kernel, linux-trace-kernel, Breno Leitao, kernel-team Add a tracepoint to monitor TCP send operations, enabling detailed visibility into TCP message transmission. Create a new tracepoint within the tcp_sendmsg_locked function, capturing traditional fields along with size_goal, which indicates the optimal data size for a single TCP segment. Additionally, a reference to the struct sock sk is passed, allowing direct access for BPF programs. The implementation is largely based on David's patch and suggestions. The implementation is largely based on David's patch[1] and suggestions. Link: https://lore.kernel.org/all/70168c8f-bf52-4279-b4c4-be64527aa1ac@kernel.org/ [1] Signed-off-by: Breno Leitao <leitao@debian.org> --- include/trace/events/tcp.h | 24 ++++++++++++++++++++++++ net/ipv4/tcp.c | 2 ++ 2 files changed, 26 insertions(+) diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h index 1a40c41ff8c30..cab25504c4f9d 100644 --- a/include/trace/events/tcp.h +++ b/include/trace/events/tcp.h @@ -259,6 +259,30 @@ TRACE_EVENT(tcp_retransmit_synack, __entry->saddr_v6, __entry->daddr_v6) ); +TRACE_EVENT(tcp_sendmsg_locked, + TP_PROTO(const struct sock *sk, const struct msghdr *msg, + const struct sk_buff *skb, int size_goal), + + TP_ARGS(sk, msg, skb, size_goal), + + TP_STRUCT__entry( + __field(const void *, skb_addr) + __field(int, skb_len) + __field(int, msg_left) + __field(int, size_goal) + ), + + TP_fast_assign( + __entry->skb_addr = skb; + __entry->skb_len = skb ? skb->len : 0; + __entry->msg_left = msg_data_left(msg); + __entry->size_goal = size_goal; + ), + + TP_printk("skb_addr %p skb_len %d msg_left %d size_goal %d", + __entry->skb_addr, __entry->skb_len, __entry->msg_left, + __entry->size_goal)); + DECLARE_TRACE(tcp_cwnd_reduction_tp, TP_PROTO(const struct sock *sk, int newly_acked_sacked, int newly_lost, int flag), diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index ea8de00f669d0..270ce2c8c2d54 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -1160,6 +1160,8 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size) if (skb) copy = size_goal - skb->len; + trace_tcp_sendmsg_locked(sk, msg, skb, size_goal); + if (copy <= 0 || !tcp_skb_can_collapse_to(skb)) { bool first_skb; -- 2.47.1 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH net-next v2 2/2] trace: tcp: Add tracepoint for tcp_sendmsg_locked() 2025-04-07 13:40 ` [PATCH net-next v2 2/2] trace: tcp: Add tracepoint for tcp_sendmsg_locked() Breno Leitao @ 2025-04-08 1:00 ` Kuniyuki Iwashima 2025-04-08 14:27 ` Breno Leitao 2025-04-08 1:05 ` Kuniyuki Iwashima 1 sibling, 1 reply; 12+ messages in thread From: Kuniyuki Iwashima @ 2025-04-08 1:00 UTC (permalink / raw) To: leitao Cc: davem, dsahern, edumazet, horms, kernel-team, kuba, kuniyu, linux-kernel, linux-trace-kernel, mathieu.desnoyers, mhiramat, ncardwell, netdev, pabeni, rostedt, song, yonghong.song From: Breno Leitao <leitao@debian.org> Date: Mon, 07 Apr 2025 06:40:44 -0700 > Add a tracepoint to monitor TCP send operations, enabling detailed > visibility into TCP message transmission. > > Create a new tracepoint within the tcp_sendmsg_locked function, > capturing traditional fields along with size_goal, which indicates the > optimal data size for a single TCP segment. Additionally, a reference to > the struct sock sk is passed, allowing direct access for BPF programs. > The implementation is largely based on David's patch and suggestions. > > The implementation is largely based on David's patch[1] and suggestions. nit: duplicate sentences. > > Link: https://lore.kernel.org/all/70168c8f-bf52-4279-b4c4-be64527aa1ac@kernel.org/ [1] > Signed-off-by: Breno Leitao <leitao@debian.org> > --- > include/trace/events/tcp.h | 24 ++++++++++++++++++++++++ > net/ipv4/tcp.c | 2 ++ > 2 files changed, 26 insertions(+) > > diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h > index 1a40c41ff8c30..cab25504c4f9d 100644 > --- a/include/trace/events/tcp.h > +++ b/include/trace/events/tcp.h > @@ -259,6 +259,30 @@ TRACE_EVENT(tcp_retransmit_synack, > __entry->saddr_v6, __entry->daddr_v6) > ); > > +TRACE_EVENT(tcp_sendmsg_locked, > + TP_PROTO(const struct sock *sk, const struct msghdr *msg, > + const struct sk_buff *skb, int size_goal), > + > + TP_ARGS(sk, msg, skb, size_goal), > + > + TP_STRUCT__entry( > + __field(const void *, skb_addr) > + __field(int, skb_len) > + __field(int, msg_left) > + __field(int, size_goal) > + ), > + > + TP_fast_assign( > + __entry->skb_addr = skb; > + __entry->skb_len = skb ? skb->len : 0; > + __entry->msg_left = msg_data_left(msg); > + __entry->size_goal = size_goal; > + ), > + > + TP_printk("skb_addr %p skb_len %d msg_left %d size_goal %d", > + __entry->skb_addr, __entry->skb_len, __entry->msg_left, > + __entry->size_goal)); > + > DECLARE_TRACE(tcp_cwnd_reduction_tp, > TP_PROTO(const struct sock *sk, int newly_acked_sacked, > int newly_lost, int flag), > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c > index ea8de00f669d0..270ce2c8c2d54 100644 > --- a/net/ipv4/tcp.c > +++ b/net/ipv4/tcp.c > @@ -1160,6 +1160,8 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size) > if (skb) > copy = size_goal - skb->len; > > + trace_tcp_sendmsg_locked(sk, msg, skb, size_goal); skb could be NULL, so I think raw_tp_null_args[] needs to be updated. Maybe try attaching a bpf prog that dereferences skb unconditionally and see if the bpf verifier rejects it. See this commit for the similar issue: commit 5da7e15fb5a12e78de974d8908f348e279922ce9 Author: Kuniyuki Iwashima <kuniyu@amazon.com> Date: Fri Jan 31 19:01:42 2025 -0800 net: Add rx_skb of kfree_skb to raw_tp_null_args[]. > + > if (copy <= 0 || !tcp_skb_can_collapse_to(skb)) { > bool first_skb; > > > -- > 2.47.1 ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next v2 2/2] trace: tcp: Add tracepoint for tcp_sendmsg_locked() 2025-04-08 1:00 ` Kuniyuki Iwashima @ 2025-04-08 14:27 ` Breno Leitao 2025-04-08 15:16 ` David Ahern 0 siblings, 1 reply; 12+ messages in thread From: Breno Leitao @ 2025-04-08 14:27 UTC (permalink / raw) To: Kuniyuki Iwashima Cc: davem, dsahern, edumazet, horms, kernel-team, kuba, linux-kernel, linux-trace-kernel, mathieu.desnoyers, mhiramat, ncardwell, netdev, pabeni, rostedt, song, yonghong.song Hello Kuniyuki, On Mon, Apr 07, 2025 at 06:00:35PM -0700, Kuniyuki Iwashima wrote: > > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c > > index ea8de00f669d0..270ce2c8c2d54 100644 > > --- a/net/ipv4/tcp.c > > +++ b/net/ipv4/tcp.c > > @@ -1160,6 +1160,8 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size) > > if (skb) > > copy = size_goal - skb->len; > > > > + trace_tcp_sendmsg_locked(sk, msg, skb, size_goal); > > skb could be NULL, so I think raw_tp_null_args[] needs to be updated. > > Maybe try attaching a bpf prog that dereferences skb unconditionally > and see if the bpf verifier rejects it. I've been trying to dereference skb (while 0), and bpf verifier rejects it. Here is the code I wrote to test: SEC("tracepoint/tcp/tcp_sendmsg_locked") int bpf_tcp_sendmsg_locked(struct tcp_sendmsg_locked_args *ctx) { bpf_printk("TCP: skb_addr %p skb_len %d msg_left %d size_goal %d", ctx->skb_addr, ctx->skb_len, ctx->msg_left, ctx->size_goal); return 0; } And it matches the tracepoint, but, trying to dereference skb_addr fails in all forms. I tried with a proper dereference, or, something as simple as the following, and the program is not loaded. bpf_printk("deref %d\n", *(int *) ctx->skb_addr); Here is all it returns. libbpf: prog 'bpf_tcp_sendmsg_locked': BPF program load failed: Permission denied libbpf: prog 'bpf_tcp_sendmsg_locked': -- BEGIN PROG LOAD LOG -- 0: R1=ctx() R10=fp0 ; int bpf_tcp_sendmsg_locked(struct tcp_sendmsg_locked_args *ctx) @ tcp_sendmsg_locked_bpf.c:16 0: (bf) r6 = r1 ; R1=ctx() R6_w=ctx() ; bpf_printk("TCP: skb_addr %p skb_len %d msg_left %d size_goal %d", @ tcp_sendmsg_locked_bpf.c:19 1: (79) r1 = *(u64 *)(r6 +8) ; R1_w=scalar() R6_w=ctx() 2: (7b) *(u64 *)(r10 -32) = r1 ; R1_w=scalar(id=1) R10=fp0 fp-32_w=scalar(id=1) 3: (61) r1 = *(u32 *)(r6 +16) ; R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R6_w=ctx() 4: (67) r1 <<= 32 ; R1_w=scalar(smax=0x7fffffff00000000,umax=0xffffffff00000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xffffffff00000000)) 5: (c7) r1 s>>= 32 ; R1_w=scalar(smin=0xffffffff80000000,smax=0x7fffffff) 6: (7b) *(u64 *)(r10 -24) = r1 ; R1_w=scalar(id=2,smin=0xffffffff80000000,smax=0x7fffffff) R10=fp0 fp-24_w=scalar(id=2,smin=0xffffffff80000000,smax=0x7fffffff) 7: (61) r1 = *(u32 *)(r6 +20) ; R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R6_w=ctx() 8: (67) r1 <<= 32 ; R1_w=scalar(smax=0x7fffffff00000000,umax=0xffffffff00000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xffffffff00000000)) 9: (c7) r1 s>>= 32 ; R1_w=scalar(smin=0xffffffff80000000,smax=0x7fffffff) 10: (7b) *(u64 *)(r10 -16) = r1 ; R1_w=scalar(id=3,smin=0xffffffff80000000,smax=0x7fffffff) R10=fp0 fp-16_w=scalar(id=3,smin=0xffffffff80000000,smax=0x7fffffff) 11: (61) r1 = *(u32 *)(r6 +24) ; R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R6_w=ctx() 12: (67) r1 <<= 32 ; R1_w=scalar(smax=0x7fffffff00000000,umax=0xffffffff00000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xffffffff00000000)) 13: (c7) r1 s>>= 32 ; R1_w=scalar(smin=0xffffffff80000000,smax=0x7fffffff) 14: (7b) *(u64 *)(r10 -8) = r1 ; R1_w=scalar(id=4,smin=0xffffffff80000000,smax=0x7fffffff) R10=fp0 fp-8_w=scalar(id=4,smin=0xffffffff80000000,smax=0x7fffffff) 15: (bf) r3 = r10 ; R3_w=fp0 R10=fp0 16: (07) r3 += -32 ; R3_w=fp-32 17: (18) r1 = 0xff1100010965cdd8 ; R1_w=map_value(map=tcp_send.rodata,ks=4,vs=63) 19: (b7) r2 = 53 ; R2_w=53 20: (b7) r4 = 32 ; R4_w=32 21: (85) call bpf_trace_vprintk#177 ; R0_w=scalar() ; bpf_printk("deref %d\n", *(int *) ctx->skb_addr); @ tcp_sendmsg_locked_bpf.c:22 22: (79) r1 = *(u64 *)(r6 +8) ; R1_w=scalar() R6_w=ctx() 23: (61) r3 = *(u32 *)(r1 +0) R1 invalid mem access 'scalar' processed 23 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 -- END PROG LOAD LOG -- libbpf: prog 'bpf_tcp_sendmsg_locked': failed to load: -13 libbpf: failed to load object 'tcp_sendmsg_locked_bpf.o' Failed to load BPF object: -13 I've pushed this example to the following URL, if you want to experiment as well: https://github.com/leitao/debug/blob/main/bpf/tracepoint/tcp_sendmsg_locked_bpf.c > See this commit for the similar issue: > > commit 5da7e15fb5a12e78de974d8908f348e279922ce9 > Author: Kuniyuki Iwashima <kuniyu@amazon.com> > Date: Fri Jan 31 19:01:42 2025 -0800 > > net: Add rx_skb of kfree_skb to raw_tp_null_args[]. > Thanks for the heads-up. I can populate raw_tp_null_args with this new tracepoint function, if that is the right thing to do, even without being able to reproduce the issue above. Thanks for the review, --breno ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next v2 2/2] trace: tcp: Add tracepoint for tcp_sendmsg_locked() 2025-04-08 14:27 ` Breno Leitao @ 2025-04-08 15:16 ` David Ahern 2025-04-08 17:01 ` Breno Leitao 0 siblings, 1 reply; 12+ messages in thread From: David Ahern @ 2025-04-08 15:16 UTC (permalink / raw) To: Breno Leitao, Kuniyuki Iwashima Cc: davem, edumazet, horms, kernel-team, kuba, linux-kernel, linux-trace-kernel, mathieu.desnoyers, mhiramat, ncardwell, netdev, pabeni, rostedt, song, yonghong.song On 4/8/25 8:27 AM, Breno Leitao wrote: > > SEC("tracepoint/tcp/tcp_sendmsg_locked") Try `raw_tracepoint/tcp/tcp_sendmsg_locked`. This is the form I use for my tracepoint based packet capture (not tied to this tracepoint, but traces inside our driver) and it works fine. As suggested, you might need to update raw_tp_null_args ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next v2 2/2] trace: tcp: Add tracepoint for tcp_sendmsg_locked() 2025-04-08 15:16 ` David Ahern @ 2025-04-08 17:01 ` Breno Leitao 2025-04-08 17:12 ` Kuniyuki Iwashima 0 siblings, 1 reply; 12+ messages in thread From: Breno Leitao @ 2025-04-08 17:01 UTC (permalink / raw) To: David Ahern Cc: Kuniyuki Iwashima, davem, edumazet, horms, kernel-team, kuba, linux-kernel, linux-trace-kernel, mathieu.desnoyers, mhiramat, ncardwell, netdev, pabeni, rostedt, song, yonghong.song On Tue, Apr 08, 2025 at 09:16:51AM -0600, David Ahern wrote: > On 4/8/25 8:27 AM, Breno Leitao wrote: > > > > SEC("tracepoint/tcp/tcp_sendmsg_locked") > > Try `raw_tracepoint/tcp/tcp_sendmsg_locked`. > > This is the form I use for my tracepoint based packet capture (not tied > to this tracepoint, but traces inside our driver) and it works fine. Thanks. I was not able to get this crashing as well. In fact, the following program fails to be loaded: SEC("raw_tracepoint/tcp/tcp_sendmsg_locked") int bpf_tcp_sendmsg_locked(struct bpf_raw_tracepoint_args *ctx) { void *skb_addr = (void *) ctx->args[0]; bpf_printk("deref %d\n", *(int *) skb_addr); return 0; } libbpf refuses to load it, and drumps: libbpf: prog 'bpf_tcp_sendmsg_locked': BPF program load failed: Permission denied libbpf: prog 'bpf_tcp_sendmsg_locked': -- BEGIN PROG LOAD LOG -- 0: R1=ctx() R10=fp0 ; void *skb_addr = (void *) ctx->args[0]; @ tcp_sendmsg_locked_bpf.c:18 0: (79) r1 = *(u64 *)(r1 +0) ; R1_w=scalar() ; bpf_printk("deref %d\n", *(int *) skb_addr); @ tcp_sendmsg_locked_bpf.c:20 1: (61) r3 = *(u32 *)(r1 +0) R1 invalid mem access 'scalar' processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 -- END PROG LOAD LOG -- libbpf: prog 'bpf_tcp_sendmsg_locked': failed to load: -13 libbpf: failed to load object 'tcp_sendmsg_locked_bpf.o' Failed to load BPF object: -13 > As suggested, you might need to update raw_tp_null_args Thanks for confirming it. I will update raw_tp_null_args, assuming that the problem exists but I am failing to reproduce it. I will send an updated version soon. Thanks --breno ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next v2 2/2] trace: tcp: Add tracepoint for tcp_sendmsg_locked() 2025-04-08 17:01 ` Breno Leitao @ 2025-04-08 17:12 ` Kuniyuki Iwashima 2025-04-08 18:06 ` Breno Leitao 0 siblings, 1 reply; 12+ messages in thread From: Kuniyuki Iwashima @ 2025-04-08 17:12 UTC (permalink / raw) To: leitao Cc: davem, dsahern, edumazet, horms, kernel-team, kuba, kuniyu, linux-kernel, linux-trace-kernel, mathieu.desnoyers, mhiramat, ncardwell, netdev, pabeni, rostedt, song, yonghong.song From: Breno Leitao <leitao@debian.org> Date: Tue, 8 Apr 2025 10:01:05 -0700 > On Tue, Apr 08, 2025 at 09:16:51AM -0600, David Ahern wrote: > > On 4/8/25 8:27 AM, Breno Leitao wrote: > > > > > > SEC("tracepoint/tcp/tcp_sendmsg_locked") > > > > Try `raw_tracepoint/tcp/tcp_sendmsg_locked`. > > > > This is the form I use for my tracepoint based packet capture (not tied > > to this tracepoint, but traces inside our driver) and it works fine. > > Thanks. I was not able to get this crashing as well. In fact, the > following program fails to be loaded: > > SEC("raw_tracepoint/tcp/tcp_sendmsg_locked") Try SEC("tp_btf/tcp_sendmsg_locked") and access the raw argument (struct sk_buff *skb) instead of bpf_raw_tracepoint_args. The original report used it and I was able to reproduce it at that time. https://lore.kernel.org/netdev/Z50zebTRzI962e6X@debian.debian/ > int bpf_tcp_sendmsg_locked(struct bpf_raw_tracepoint_args *ctx) > { > void *skb_addr = (void *) ctx->args[0]; > > bpf_printk("deref %d\n", *(int *) skb_addr); > > return 0; > } > > libbpf refuses to load it, and drumps: > > libbpf: prog 'bpf_tcp_sendmsg_locked': BPF program load failed: Permission denied > libbpf: prog 'bpf_tcp_sendmsg_locked': -- BEGIN PROG LOAD LOG -- > 0: R1=ctx() R10=fp0 > ; void *skb_addr = (void *) ctx->args[0]; @ tcp_sendmsg_locked_bpf.c:18 > 0: (79) r1 = *(u64 *)(r1 +0) ; R1_w=scalar() > ; bpf_printk("deref %d\n", *(int *) skb_addr); @ tcp_sendmsg_locked_bpf.c:20 > 1: (61) r3 = *(u32 *)(r1 +0) > R1 invalid mem access 'scalar' > processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 > -- END PROG LOAD LOG -- > libbpf: prog 'bpf_tcp_sendmsg_locked': failed to load: -13 > libbpf: failed to load object 'tcp_sendmsg_locked_bpf.o' > Failed to load BPF object: -13 > > > As suggested, you might need to update raw_tp_null_args > > Thanks for confirming it. I will update raw_tp_null_args, assuming that > the problem exists but I am failing to reproduce it. > > I will send an updated version soon. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next v2 2/2] trace: tcp: Add tracepoint for tcp_sendmsg_locked() 2025-04-08 17:12 ` Kuniyuki Iwashima @ 2025-04-08 18:06 ` Breno Leitao 0 siblings, 0 replies; 12+ messages in thread From: Breno Leitao @ 2025-04-08 18:06 UTC (permalink / raw) To: Kuniyuki Iwashima Cc: davem, dsahern, edumazet, horms, kernel-team, kuba, linux-kernel, linux-trace-kernel, mathieu.desnoyers, mhiramat, ncardwell, netdev, pabeni, rostedt, song, yonghong.song On Tue, Apr 08, 2025 at 10:12:14AM -0700, Kuniyuki Iwashima wrote: > From: Breno Leitao <leitao@debian.org> > Date: Tue, 8 Apr 2025 10:01:05 -0700 > > On Tue, Apr 08, 2025 at 09:16:51AM -0600, David Ahern wrote: > > > On 4/8/25 8:27 AM, Breno Leitao wrote: > > > > > > > > SEC("tracepoint/tcp/tcp_sendmsg_locked") > > > > > > Try `raw_tracepoint/tcp/tcp_sendmsg_locked`. > > > > > > This is the form I use for my tracepoint based packet capture (not tied > > > to this tracepoint, but traces inside our driver) and it works fine. > > > > Thanks. I was not able to get this crashing as well. In fact, the > > following program fails to be loaded: > > > > SEC("raw_tracepoint/tcp/tcp_sendmsg_locked") > > Try SEC("tp_btf/tcp_sendmsg_locked") and access the raw argument > (struct sk_buff *skb) instead of bpf_raw_tracepoint_args. Nice, I was able to crash the host, with the following code: SEC("tp_btf/tcp_sendmsg_locked") int BPF_PROG(tcp_sendmsg_locked, struct sock *sk, struct msghdr *msg, struct sk_buff *skb, int size_goal) { bpf_printk("skb->len %d\n", skb->len); return 0; } This is the unusually expected stacktrace. :-) BUG: kernel NULL pointer dereference, address: 0000000000000070 #PF: supervisor read access in kernel mode "virtme-ng" 11:03 08-Apr-25 #PF: error_code(0x0000) - not-present page PGD 10ca78067 P4D 0 Oops: Oops: 0000 [#1] SMP DEBUG_PAGEALLOC NOPTI CPU: 13 UID: 0 PID: 1020 Comm: nc Tainted: G E N 6.14.0-upstream-05880-g14fbb7a1a500 #73 PREEMPT(undef) Tainted: [E]=UNSIGNED_MODULE, [N]=TEST Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 RIP: 0010:bpf_prog_5b31430a4390397c_tcp_sendmsg_locked+0x18/0x37 Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa 48 8b 7f 10 <8b> 57 70 48 bf d8 d9 03 06 01 00 11 ff be 0d 00 00 00 e8 15 f4 4c RSP: 0018:ffa0000003c03bd0 EFLAGS: 00010282 RAX: 5aab7562e1de3200 RBX: ffa0000003be4000 RCX: 0000000000000018 RDX: 0000000000000000 RSI: ffa0000003be4048 RDI: 0000000000000000 RBP: ffa0000003c03bd0 R08: 000000000006043d R09: ffffffffffffffff R10: 0000000000000000 R11: ffffffffa000096c R12: ff11000104ae5b00 R13: ff1100010610a3c0 R14: ffffffff814d34ef R15: 0000000000000000 FS: 00007fd67d550740(0000) GS:ff110005a40a9000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000070 CR3: 000000010d9ec002 CR4: 0000000000771ef0 PKRU: 55555554 Call Trace: <TASK> ? __die_body+0xaf/0xc0 ? page_fault_oops+0x35b/0x3c0 ? do_user_addr_fault+0x6d4/0x730 ? srso_alias_return_thunk+0x5/0xfbef5 ? exc_page_fault+0x5f/0xe0 ? asm_exc_page_fault+0x26/0x30 ? bpf_trace_run4+0xbf/0x240 ? 0xffffffffa000096c ? bpf_prog_5b31430a4390397c_tcp_sendmsg_locked+0x18/0x37 bpf_trace_run4+0x14c/0x240 ? trace_event_raw_event_tcp_sendmsg_locked+0xc3/0xf0 __traceiter_tcp_sendmsg_locked+0x44/0x60 tcp_sendmsg_locked+0x10c8/0x15b0 ? __local_bh_enable_ip+0x166/0x1c0 ? srso_alias_return_thunk+0x5/0xfbef5 tcp_sendmsg+0x2c/0x50 ? __pfx_inet6_sendmsg+0x10/0x10 sock_sendmsg_nosec+0xa0/0x100 __sys_sendto+0x1b4/0x1f0 __x64_sys_sendto+0x26/0x30 do_syscall_64+0x83/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next v2 2/2] trace: tcp: Add tracepoint for tcp_sendmsg_locked() 2025-04-07 13:40 ` [PATCH net-next v2 2/2] trace: tcp: Add tracepoint for tcp_sendmsg_locked() Breno Leitao 2025-04-08 1:00 ` Kuniyuki Iwashima @ 2025-04-08 1:05 ` Kuniyuki Iwashima 1 sibling, 0 replies; 12+ messages in thread From: Kuniyuki Iwashima @ 2025-04-08 1:05 UTC (permalink / raw) To: leitao Cc: davem, dsahern, edumazet, horms, kernel-team, kuba, kuniyu, linux-kernel, linux-trace-kernel, mathieu.desnoyers, mhiramat, ncardwell, netdev, pabeni, rostedt, song, yonghong.song From: Breno Leitao <leitao@debian.org> Date: Mon, 07 Apr 2025 06:40:44 -0700 > + TP_printk("skb_addr %p skb_len %d msg_left %d size_goal %d", > + __entry->skb_addr, __entry->skb_len, __entry->msg_left, > + __entry->size_goal)); Also could you align these two lines as other events ? ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2025-04-08 18:06 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-04-07 13:40 [PATCH net-next v2 0/2] trace: add tracepoint to tcp_sendmsg_locked Breno Leitao 2025-04-07 13:40 ` [PATCH net-next v2 1/2] net: pass const to msg_data_left() Breno Leitao 2025-04-08 0:53 ` Kuniyuki Iwashima 2025-04-08 14:20 ` Eric Dumazet 2025-04-07 13:40 ` [PATCH net-next v2 2/2] trace: tcp: Add tracepoint for tcp_sendmsg_locked() Breno Leitao 2025-04-08 1:00 ` Kuniyuki Iwashima 2025-04-08 14:27 ` Breno Leitao 2025-04-08 15:16 ` David Ahern 2025-04-08 17:01 ` Breno Leitao 2025-04-08 17:12 ` Kuniyuki Iwashima 2025-04-08 18:06 ` Breno Leitao 2025-04-08 1:05 ` Kuniyuki Iwashima
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).