* [PATCH net-next v2 0/2] trace: add tracepoint to tcp_sendmsg_locked
@ 2025-04-07 13:40 Breno Leitao
2025-04-07 13:40 ` [PATCH net-next v2 1/2] net: pass const to msg_data_left() Breno Leitao
2025-04-07 13:40 ` [PATCH net-next v2 2/2] trace: tcp: Add tracepoint for tcp_sendmsg_locked() Breno Leitao
0 siblings, 2 replies; 12+ messages in thread
From: Breno Leitao @ 2025-04-07 13:40 UTC (permalink / raw)
To: David Ahern, Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima,
Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
David S. Miller, David Ahern, Jakub Kicinski, Paolo Abeni,
Simon Horman
Cc: Song Liu, Yonghong Song, netdev, linux-kernel, linux-trace-kernel,
Breno Leitao, kernel-team
Meta has been using BPF programs to monitor tcp_sendmsg() for years,
indicating significant interest in observing this important
functionality. Adding a proper tracepoint provides a stable API for all
users who need visibility into TCP message transmission.
David Ahern is using a similar functionality with a custom patch[1]. So,
this means we have more than a single use case for this request, and it
might be a good idea to have such feature upstream.
Link: https://lore.kernel.org/all/70168c8f-bf52-4279-b4c4-be64527aa1ac@kernel.org/ [1]
Signed-off-by: Breno Leitao <leitao@debian.org>
---
Changes in v2:
- Change to a full tracepoint inside tcp_sendmsg_locked(), heavily
inspired in David's patch
- Link to v1: https://lore.kernel.org/r/20250224-tcpsendmsg-v1-1-bac043c59cc8@debian.org
---
Breno Leitao (2):
net: pass const to msg_data_left()
trace: tcp: Add tracepoint for tcp_sendmsg_locked()
include/linux/socket.h | 2 +-
include/trace/events/tcp.h | 24 ++++++++++++++++++++++++
net/ipv4/tcp.c | 2 ++
3 files changed, 27 insertions(+), 1 deletion(-)
---
base-commit: 1a9239bb4253f9076b5b4b2a1a4e8d7defd77a95
change-id: 20250224-tcpsendmsg-4f0a236751d7
Best regards,
--
Breno Leitao <leitao@debian.org>
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH net-next v2 1/2] net: pass const to msg_data_left()
2025-04-07 13:40 [PATCH net-next v2 0/2] trace: add tracepoint to tcp_sendmsg_locked Breno Leitao
@ 2025-04-07 13:40 ` Breno Leitao
2025-04-08 0:53 ` Kuniyuki Iwashima
2025-04-08 14:20 ` Eric Dumazet
2025-04-07 13:40 ` [PATCH net-next v2 2/2] trace: tcp: Add tracepoint for tcp_sendmsg_locked() Breno Leitao
1 sibling, 2 replies; 12+ messages in thread
From: Breno Leitao @ 2025-04-07 13:40 UTC (permalink / raw)
To: David Ahern, Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima,
Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
David S. Miller, David Ahern, Jakub Kicinski, Paolo Abeni,
Simon Horman
Cc: Song Liu, Yonghong Song, netdev, linux-kernel, linux-trace-kernel,
Breno Leitao, kernel-team
The msg_data_left() function doesn't modify the struct msghdr parameter,
so mark it as const. This allows the function to be used with const
references, improving type safety and making the API more flexible.
Signed-off-by: Breno Leitao <leitao@debian.org>
---
include/linux/socket.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/socket.h b/include/linux/socket.h
index c3322eb3d6865..3b262487ec060 100644
--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -168,7 +168,7 @@ static inline struct cmsghdr * cmsg_nxthdr (struct msghdr *__msg, struct cmsghdr
return __cmsg_nxthdr(__msg->msg_control, __msg->msg_controllen, __cmsg);
}
-static inline size_t msg_data_left(struct msghdr *msg)
+static inline size_t msg_data_left(const struct msghdr *msg)
{
return iov_iter_count(&msg->msg_iter);
}
--
2.47.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH net-next v2 2/2] trace: tcp: Add tracepoint for tcp_sendmsg_locked()
2025-04-07 13:40 [PATCH net-next v2 0/2] trace: add tracepoint to tcp_sendmsg_locked Breno Leitao
2025-04-07 13:40 ` [PATCH net-next v2 1/2] net: pass const to msg_data_left() Breno Leitao
@ 2025-04-07 13:40 ` Breno Leitao
2025-04-08 1:00 ` Kuniyuki Iwashima
2025-04-08 1:05 ` Kuniyuki Iwashima
1 sibling, 2 replies; 12+ messages in thread
From: Breno Leitao @ 2025-04-07 13:40 UTC (permalink / raw)
To: David Ahern, Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima,
Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
David S. Miller, David Ahern, Jakub Kicinski, Paolo Abeni,
Simon Horman
Cc: Song Liu, Yonghong Song, netdev, linux-kernel, linux-trace-kernel,
Breno Leitao, kernel-team
Add a tracepoint to monitor TCP send operations, enabling detailed
visibility into TCP message transmission.
Create a new tracepoint within the tcp_sendmsg_locked function,
capturing traditional fields along with size_goal, which indicates the
optimal data size for a single TCP segment. Additionally, a reference to
the struct sock sk is passed, allowing direct access for BPF programs.
The implementation is largely based on David's patch and suggestions.
The implementation is largely based on David's patch[1] and suggestions.
Link: https://lore.kernel.org/all/70168c8f-bf52-4279-b4c4-be64527aa1ac@kernel.org/ [1]
Signed-off-by: Breno Leitao <leitao@debian.org>
---
include/trace/events/tcp.h | 24 ++++++++++++++++++++++++
net/ipv4/tcp.c | 2 ++
2 files changed, 26 insertions(+)
diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
index 1a40c41ff8c30..cab25504c4f9d 100644
--- a/include/trace/events/tcp.h
+++ b/include/trace/events/tcp.h
@@ -259,6 +259,30 @@ TRACE_EVENT(tcp_retransmit_synack,
__entry->saddr_v6, __entry->daddr_v6)
);
+TRACE_EVENT(tcp_sendmsg_locked,
+ TP_PROTO(const struct sock *sk, const struct msghdr *msg,
+ const struct sk_buff *skb, int size_goal),
+
+ TP_ARGS(sk, msg, skb, size_goal),
+
+ TP_STRUCT__entry(
+ __field(const void *, skb_addr)
+ __field(int, skb_len)
+ __field(int, msg_left)
+ __field(int, size_goal)
+ ),
+
+ TP_fast_assign(
+ __entry->skb_addr = skb;
+ __entry->skb_len = skb ? skb->len : 0;
+ __entry->msg_left = msg_data_left(msg);
+ __entry->size_goal = size_goal;
+ ),
+
+ TP_printk("skb_addr %p skb_len %d msg_left %d size_goal %d",
+ __entry->skb_addr, __entry->skb_len, __entry->msg_left,
+ __entry->size_goal));
+
DECLARE_TRACE(tcp_cwnd_reduction_tp,
TP_PROTO(const struct sock *sk, int newly_acked_sacked,
int newly_lost, int flag),
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index ea8de00f669d0..270ce2c8c2d54 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1160,6 +1160,8 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
if (skb)
copy = size_goal - skb->len;
+ trace_tcp_sendmsg_locked(sk, msg, skb, size_goal);
+
if (copy <= 0 || !tcp_skb_can_collapse_to(skb)) {
bool first_skb;
--
2.47.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH net-next v2 1/2] net: pass const to msg_data_left()
2025-04-07 13:40 ` [PATCH net-next v2 1/2] net: pass const to msg_data_left() Breno Leitao
@ 2025-04-08 0:53 ` Kuniyuki Iwashima
2025-04-08 14:20 ` Eric Dumazet
1 sibling, 0 replies; 12+ messages in thread
From: Kuniyuki Iwashima @ 2025-04-08 0:53 UTC (permalink / raw)
To: leitao
Cc: davem, dsahern, edumazet, horms, kernel-team, kuba, kuniyu,
linux-kernel, linux-trace-kernel, mathieu.desnoyers, mhiramat,
ncardwell, netdev, pabeni, rostedt, song, yonghong.song
From: Breno Leitao <leitao@debian.org>
Date: Mon, 07 Apr 2025 06:40:43 -0700
> The msg_data_left() function doesn't modify the struct msghdr parameter,
> so mark it as const. This allows the function to be used with const
> references, improving type safety and making the API more flexible.
>
> Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next v2 2/2] trace: tcp: Add tracepoint for tcp_sendmsg_locked()
2025-04-07 13:40 ` [PATCH net-next v2 2/2] trace: tcp: Add tracepoint for tcp_sendmsg_locked() Breno Leitao
@ 2025-04-08 1:00 ` Kuniyuki Iwashima
2025-04-08 14:27 ` Breno Leitao
2025-04-08 1:05 ` Kuniyuki Iwashima
1 sibling, 1 reply; 12+ messages in thread
From: Kuniyuki Iwashima @ 2025-04-08 1:00 UTC (permalink / raw)
To: leitao
Cc: davem, dsahern, edumazet, horms, kernel-team, kuba, kuniyu,
linux-kernel, linux-trace-kernel, mathieu.desnoyers, mhiramat,
ncardwell, netdev, pabeni, rostedt, song, yonghong.song
From: Breno Leitao <leitao@debian.org>
Date: Mon, 07 Apr 2025 06:40:44 -0700
> Add a tracepoint to monitor TCP send operations, enabling detailed
> visibility into TCP message transmission.
>
> Create a new tracepoint within the tcp_sendmsg_locked function,
> capturing traditional fields along with size_goal, which indicates the
> optimal data size for a single TCP segment. Additionally, a reference to
> the struct sock sk is passed, allowing direct access for BPF programs.
> The implementation is largely based on David's patch and suggestions.
>
> The implementation is largely based on David's patch[1] and suggestions.
nit: duplicate sentences.
>
> Link: https://lore.kernel.org/all/70168c8f-bf52-4279-b4c4-be64527aa1ac@kernel.org/ [1]
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---
> include/trace/events/tcp.h | 24 ++++++++++++++++++++++++
> net/ipv4/tcp.c | 2 ++
> 2 files changed, 26 insertions(+)
>
> diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
> index 1a40c41ff8c30..cab25504c4f9d 100644
> --- a/include/trace/events/tcp.h
> +++ b/include/trace/events/tcp.h
> @@ -259,6 +259,30 @@ TRACE_EVENT(tcp_retransmit_synack,
> __entry->saddr_v6, __entry->daddr_v6)
> );
>
> +TRACE_EVENT(tcp_sendmsg_locked,
> + TP_PROTO(const struct sock *sk, const struct msghdr *msg,
> + const struct sk_buff *skb, int size_goal),
> +
> + TP_ARGS(sk, msg, skb, size_goal),
> +
> + TP_STRUCT__entry(
> + __field(const void *, skb_addr)
> + __field(int, skb_len)
> + __field(int, msg_left)
> + __field(int, size_goal)
> + ),
> +
> + TP_fast_assign(
> + __entry->skb_addr = skb;
> + __entry->skb_len = skb ? skb->len : 0;
> + __entry->msg_left = msg_data_left(msg);
> + __entry->size_goal = size_goal;
> + ),
> +
> + TP_printk("skb_addr %p skb_len %d msg_left %d size_goal %d",
> + __entry->skb_addr, __entry->skb_len, __entry->msg_left,
> + __entry->size_goal));
> +
> DECLARE_TRACE(tcp_cwnd_reduction_tp,
> TP_PROTO(const struct sock *sk, int newly_acked_sacked,
> int newly_lost, int flag),
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index ea8de00f669d0..270ce2c8c2d54 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -1160,6 +1160,8 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
> if (skb)
> copy = size_goal - skb->len;
>
> + trace_tcp_sendmsg_locked(sk, msg, skb, size_goal);
skb could be NULL, so I think raw_tp_null_args[] needs to be updated.
Maybe try attaching a bpf prog that dereferences skb unconditionally
and see if the bpf verifier rejects it.
See this commit for the similar issue:
commit 5da7e15fb5a12e78de974d8908f348e279922ce9
Author: Kuniyuki Iwashima <kuniyu@amazon.com>
Date: Fri Jan 31 19:01:42 2025 -0800
net: Add rx_skb of kfree_skb to raw_tp_null_args[].
> +
> if (copy <= 0 || !tcp_skb_can_collapse_to(skb)) {
> bool first_skb;
>
>
> --
> 2.47.1
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next v2 2/2] trace: tcp: Add tracepoint for tcp_sendmsg_locked()
2025-04-07 13:40 ` [PATCH net-next v2 2/2] trace: tcp: Add tracepoint for tcp_sendmsg_locked() Breno Leitao
2025-04-08 1:00 ` Kuniyuki Iwashima
@ 2025-04-08 1:05 ` Kuniyuki Iwashima
1 sibling, 0 replies; 12+ messages in thread
From: Kuniyuki Iwashima @ 2025-04-08 1:05 UTC (permalink / raw)
To: leitao
Cc: davem, dsahern, edumazet, horms, kernel-team, kuba, kuniyu,
linux-kernel, linux-trace-kernel, mathieu.desnoyers, mhiramat,
ncardwell, netdev, pabeni, rostedt, song, yonghong.song
From: Breno Leitao <leitao@debian.org>
Date: Mon, 07 Apr 2025 06:40:44 -0700
> + TP_printk("skb_addr %p skb_len %d msg_left %d size_goal %d",
> + __entry->skb_addr, __entry->skb_len, __entry->msg_left,
> + __entry->size_goal));
Also could you align these two lines as other events ?
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next v2 1/2] net: pass const to msg_data_left()
2025-04-07 13:40 ` [PATCH net-next v2 1/2] net: pass const to msg_data_left() Breno Leitao
2025-04-08 0:53 ` Kuniyuki Iwashima
@ 2025-04-08 14:20 ` Eric Dumazet
1 sibling, 0 replies; 12+ messages in thread
From: Eric Dumazet @ 2025-04-08 14:20 UTC (permalink / raw)
To: Breno Leitao
Cc: David Ahern, Neal Cardwell, Kuniyuki Iwashima, Steven Rostedt,
Masami Hiramatsu, Mathieu Desnoyers, David S. Miller,
Jakub Kicinski, Paolo Abeni, Simon Horman, Song Liu,
Yonghong Song, netdev, linux-kernel, linux-trace-kernel,
kernel-team
On Mon, Apr 7, 2025 at 3:40 PM Breno Leitao <leitao@debian.org> wrote:
>
> The msg_data_left() function doesn't modify the struct msghdr parameter,
> so mark it as const. This allows the function to be used with const
> references, improving type safety and making the API more flexible.
>
> Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next v2 2/2] trace: tcp: Add tracepoint for tcp_sendmsg_locked()
2025-04-08 1:00 ` Kuniyuki Iwashima
@ 2025-04-08 14:27 ` Breno Leitao
2025-04-08 15:16 ` David Ahern
0 siblings, 1 reply; 12+ messages in thread
From: Breno Leitao @ 2025-04-08 14:27 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: davem, dsahern, edumazet, horms, kernel-team, kuba, linux-kernel,
linux-trace-kernel, mathieu.desnoyers, mhiramat, ncardwell,
netdev, pabeni, rostedt, song, yonghong.song
Hello Kuniyuki,
On Mon, Apr 07, 2025 at 06:00:35PM -0700, Kuniyuki Iwashima wrote:
> > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> > index ea8de00f669d0..270ce2c8c2d54 100644
> > --- a/net/ipv4/tcp.c
> > +++ b/net/ipv4/tcp.c
> > @@ -1160,6 +1160,8 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
> > if (skb)
> > copy = size_goal - skb->len;
> >
> > + trace_tcp_sendmsg_locked(sk, msg, skb, size_goal);
>
> skb could be NULL, so I think raw_tp_null_args[] needs to be updated.
>
> Maybe try attaching a bpf prog that dereferences skb unconditionally
> and see if the bpf verifier rejects it.
I've been trying to dereference skb (while 0), and bpf verifier rejects
it.
Here is the code I wrote to test:
SEC("tracepoint/tcp/tcp_sendmsg_locked")
int bpf_tcp_sendmsg_locked(struct tcp_sendmsg_locked_args *ctx)
{
bpf_printk("TCP: skb_addr %p skb_len %d msg_left %d size_goal %d",
ctx->skb_addr, ctx->skb_len, ctx->msg_left, ctx->size_goal);
return 0;
}
And it matches the tracepoint, but, trying to dereference skb_addr fails
in all forms. I tried with a proper dereference, or, something as simple
as the following, and the program is not loaded.
bpf_printk("deref %d\n", *(int *) ctx->skb_addr);
Here is all it returns.
libbpf: prog 'bpf_tcp_sendmsg_locked': BPF program load failed: Permission denied
libbpf: prog 'bpf_tcp_sendmsg_locked': -- BEGIN PROG LOAD LOG --
0: R1=ctx() R10=fp0
; int bpf_tcp_sendmsg_locked(struct tcp_sendmsg_locked_args *ctx) @ tcp_sendmsg_locked_bpf.c:16
0: (bf) r6 = r1 ; R1=ctx() R6_w=ctx()
; bpf_printk("TCP: skb_addr %p skb_len %d msg_left %d size_goal %d", @ tcp_sendmsg_locked_bpf.c:19
1: (79) r1 = *(u64 *)(r6 +8) ; R1_w=scalar() R6_w=ctx()
2: (7b) *(u64 *)(r10 -32) = r1 ; R1_w=scalar(id=1) R10=fp0 fp-32_w=scalar(id=1)
3: (61) r1 = *(u32 *)(r6 +16) ; R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R6_w=ctx()
4: (67) r1 <<= 32 ; R1_w=scalar(smax=0x7fffffff00000000,umax=0xffffffff00000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xffffffff00000000))
5: (c7) r1 s>>= 32 ; R1_w=scalar(smin=0xffffffff80000000,smax=0x7fffffff)
6: (7b) *(u64 *)(r10 -24) = r1 ; R1_w=scalar(id=2,smin=0xffffffff80000000,smax=0x7fffffff) R10=fp0 fp-24_w=scalar(id=2,smin=0xffffffff80000000,smax=0x7fffffff)
7: (61) r1 = *(u32 *)(r6 +20) ; R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R6_w=ctx()
8: (67) r1 <<= 32 ; R1_w=scalar(smax=0x7fffffff00000000,umax=0xffffffff00000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xffffffff00000000))
9: (c7) r1 s>>= 32 ; R1_w=scalar(smin=0xffffffff80000000,smax=0x7fffffff)
10: (7b) *(u64 *)(r10 -16) = r1 ; R1_w=scalar(id=3,smin=0xffffffff80000000,smax=0x7fffffff) R10=fp0 fp-16_w=scalar(id=3,smin=0xffffffff80000000,smax=0x7fffffff)
11: (61) r1 = *(u32 *)(r6 +24) ; R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R6_w=ctx()
12: (67) r1 <<= 32 ; R1_w=scalar(smax=0x7fffffff00000000,umax=0xffffffff00000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xffffffff00000000))
13: (c7) r1 s>>= 32 ; R1_w=scalar(smin=0xffffffff80000000,smax=0x7fffffff)
14: (7b) *(u64 *)(r10 -8) = r1 ; R1_w=scalar(id=4,smin=0xffffffff80000000,smax=0x7fffffff) R10=fp0 fp-8_w=scalar(id=4,smin=0xffffffff80000000,smax=0x7fffffff)
15: (bf) r3 = r10 ; R3_w=fp0 R10=fp0
16: (07) r3 += -32 ; R3_w=fp-32
17: (18) r1 = 0xff1100010965cdd8 ; R1_w=map_value(map=tcp_send.rodata,ks=4,vs=63)
19: (b7) r2 = 53 ; R2_w=53
20: (b7) r4 = 32 ; R4_w=32
21: (85) call bpf_trace_vprintk#177 ; R0_w=scalar()
; bpf_printk("deref %d\n", *(int *) ctx->skb_addr); @ tcp_sendmsg_locked_bpf.c:22
22: (79) r1 = *(u64 *)(r6 +8) ; R1_w=scalar() R6_w=ctx()
23: (61) r3 = *(u32 *)(r1 +0)
R1 invalid mem access 'scalar'
processed 23 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
-- END PROG LOAD LOG --
libbpf: prog 'bpf_tcp_sendmsg_locked': failed to load: -13
libbpf: failed to load object 'tcp_sendmsg_locked_bpf.o'
Failed to load BPF object: -13
I've pushed this example to the following URL, if you want to experiment
as well:
https://github.com/leitao/debug/blob/main/bpf/tracepoint/tcp_sendmsg_locked_bpf.c
> See this commit for the similar issue:
>
> commit 5da7e15fb5a12e78de974d8908f348e279922ce9
> Author: Kuniyuki Iwashima <kuniyu@amazon.com>
> Date: Fri Jan 31 19:01:42 2025 -0800
>
> net: Add rx_skb of kfree_skb to raw_tp_null_args[].
>
Thanks for the heads-up. I can populate raw_tp_null_args with this new
tracepoint function, if that is the right thing to do, even without
being able to reproduce the issue above.
Thanks for the review,
--breno
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next v2 2/2] trace: tcp: Add tracepoint for tcp_sendmsg_locked()
2025-04-08 14:27 ` Breno Leitao
@ 2025-04-08 15:16 ` David Ahern
2025-04-08 17:01 ` Breno Leitao
0 siblings, 1 reply; 12+ messages in thread
From: David Ahern @ 2025-04-08 15:16 UTC (permalink / raw)
To: Breno Leitao, Kuniyuki Iwashima
Cc: davem, edumazet, horms, kernel-team, kuba, linux-kernel,
linux-trace-kernel, mathieu.desnoyers, mhiramat, ncardwell,
netdev, pabeni, rostedt, song, yonghong.song
On 4/8/25 8:27 AM, Breno Leitao wrote:
>
> SEC("tracepoint/tcp/tcp_sendmsg_locked")
Try `raw_tracepoint/tcp/tcp_sendmsg_locked`.
This is the form I use for my tracepoint based packet capture (not tied
to this tracepoint, but traces inside our driver) and it works fine.
As suggested, you might need to update raw_tp_null_args
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next v2 2/2] trace: tcp: Add tracepoint for tcp_sendmsg_locked()
2025-04-08 15:16 ` David Ahern
@ 2025-04-08 17:01 ` Breno Leitao
2025-04-08 17:12 ` Kuniyuki Iwashima
0 siblings, 1 reply; 12+ messages in thread
From: Breno Leitao @ 2025-04-08 17:01 UTC (permalink / raw)
To: David Ahern
Cc: Kuniyuki Iwashima, davem, edumazet, horms, kernel-team, kuba,
linux-kernel, linux-trace-kernel, mathieu.desnoyers, mhiramat,
ncardwell, netdev, pabeni, rostedt, song, yonghong.song
On Tue, Apr 08, 2025 at 09:16:51AM -0600, David Ahern wrote:
> On 4/8/25 8:27 AM, Breno Leitao wrote:
> >
> > SEC("tracepoint/tcp/tcp_sendmsg_locked")
>
> Try `raw_tracepoint/tcp/tcp_sendmsg_locked`.
>
> This is the form I use for my tracepoint based packet capture (not tied
> to this tracepoint, but traces inside our driver) and it works fine.
Thanks. I was not able to get this crashing as well. In fact, the
following program fails to be loaded:
SEC("raw_tracepoint/tcp/tcp_sendmsg_locked")
int bpf_tcp_sendmsg_locked(struct bpf_raw_tracepoint_args *ctx)
{
void *skb_addr = (void *) ctx->args[0];
bpf_printk("deref %d\n", *(int *) skb_addr);
return 0;
}
libbpf refuses to load it, and drumps:
libbpf: prog 'bpf_tcp_sendmsg_locked': BPF program load failed: Permission denied
libbpf: prog 'bpf_tcp_sendmsg_locked': -- BEGIN PROG LOAD LOG --
0: R1=ctx() R10=fp0
; void *skb_addr = (void *) ctx->args[0]; @ tcp_sendmsg_locked_bpf.c:18
0: (79) r1 = *(u64 *)(r1 +0) ; R1_w=scalar()
; bpf_printk("deref %d\n", *(int *) skb_addr); @ tcp_sendmsg_locked_bpf.c:20
1: (61) r3 = *(u32 *)(r1 +0)
R1 invalid mem access 'scalar'
processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
-- END PROG LOAD LOG --
libbpf: prog 'bpf_tcp_sendmsg_locked': failed to load: -13
libbpf: failed to load object 'tcp_sendmsg_locked_bpf.o'
Failed to load BPF object: -13
> As suggested, you might need to update raw_tp_null_args
Thanks for confirming it. I will update raw_tp_null_args, assuming that
the problem exists but I am failing to reproduce it.
I will send an updated version soon.
Thanks
--breno
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next v2 2/2] trace: tcp: Add tracepoint for tcp_sendmsg_locked()
2025-04-08 17:01 ` Breno Leitao
@ 2025-04-08 17:12 ` Kuniyuki Iwashima
2025-04-08 18:06 ` Breno Leitao
0 siblings, 1 reply; 12+ messages in thread
From: Kuniyuki Iwashima @ 2025-04-08 17:12 UTC (permalink / raw)
To: leitao
Cc: davem, dsahern, edumazet, horms, kernel-team, kuba, kuniyu,
linux-kernel, linux-trace-kernel, mathieu.desnoyers, mhiramat,
ncardwell, netdev, pabeni, rostedt, song, yonghong.song
From: Breno Leitao <leitao@debian.org>
Date: Tue, 8 Apr 2025 10:01:05 -0700
> On Tue, Apr 08, 2025 at 09:16:51AM -0600, David Ahern wrote:
> > On 4/8/25 8:27 AM, Breno Leitao wrote:
> > >
> > > SEC("tracepoint/tcp/tcp_sendmsg_locked")
> >
> > Try `raw_tracepoint/tcp/tcp_sendmsg_locked`.
> >
> > This is the form I use for my tracepoint based packet capture (not tied
> > to this tracepoint, but traces inside our driver) and it works fine.
>
> Thanks. I was not able to get this crashing as well. In fact, the
> following program fails to be loaded:
>
> SEC("raw_tracepoint/tcp/tcp_sendmsg_locked")
Try SEC("tp_btf/tcp_sendmsg_locked") and access the raw argument
(struct sk_buff *skb) instead of bpf_raw_tracepoint_args.
The original report used it and I was able to reproduce it
at that time.
https://lore.kernel.org/netdev/Z50zebTRzI962e6X@debian.debian/
> int bpf_tcp_sendmsg_locked(struct bpf_raw_tracepoint_args *ctx)
> {
> void *skb_addr = (void *) ctx->args[0];
>
> bpf_printk("deref %d\n", *(int *) skb_addr);
>
> return 0;
> }
>
> libbpf refuses to load it, and drumps:
>
> libbpf: prog 'bpf_tcp_sendmsg_locked': BPF program load failed: Permission denied
> libbpf: prog 'bpf_tcp_sendmsg_locked': -- BEGIN PROG LOAD LOG --
> 0: R1=ctx() R10=fp0
> ; void *skb_addr = (void *) ctx->args[0]; @ tcp_sendmsg_locked_bpf.c:18
> 0: (79) r1 = *(u64 *)(r1 +0) ; R1_w=scalar()
> ; bpf_printk("deref %d\n", *(int *) skb_addr); @ tcp_sendmsg_locked_bpf.c:20
> 1: (61) r3 = *(u32 *)(r1 +0)
> R1 invalid mem access 'scalar'
> processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
> -- END PROG LOAD LOG --
> libbpf: prog 'bpf_tcp_sendmsg_locked': failed to load: -13
> libbpf: failed to load object 'tcp_sendmsg_locked_bpf.o'
> Failed to load BPF object: -13
>
> > As suggested, you might need to update raw_tp_null_args
>
> Thanks for confirming it. I will update raw_tp_null_args, assuming that
> the problem exists but I am failing to reproduce it.
>
> I will send an updated version soon.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next v2 2/2] trace: tcp: Add tracepoint for tcp_sendmsg_locked()
2025-04-08 17:12 ` Kuniyuki Iwashima
@ 2025-04-08 18:06 ` Breno Leitao
0 siblings, 0 replies; 12+ messages in thread
From: Breno Leitao @ 2025-04-08 18:06 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: davem, dsahern, edumazet, horms, kernel-team, kuba, linux-kernel,
linux-trace-kernel, mathieu.desnoyers, mhiramat, ncardwell,
netdev, pabeni, rostedt, song, yonghong.song
On Tue, Apr 08, 2025 at 10:12:14AM -0700, Kuniyuki Iwashima wrote:
> From: Breno Leitao <leitao@debian.org>
> Date: Tue, 8 Apr 2025 10:01:05 -0700
> > On Tue, Apr 08, 2025 at 09:16:51AM -0600, David Ahern wrote:
> > > On 4/8/25 8:27 AM, Breno Leitao wrote:
> > > >
> > > > SEC("tracepoint/tcp/tcp_sendmsg_locked")
> > >
> > > Try `raw_tracepoint/tcp/tcp_sendmsg_locked`.
> > >
> > > This is the form I use for my tracepoint based packet capture (not tied
> > > to this tracepoint, but traces inside our driver) and it works fine.
> >
> > Thanks. I was not able to get this crashing as well. In fact, the
> > following program fails to be loaded:
> >
> > SEC("raw_tracepoint/tcp/tcp_sendmsg_locked")
>
> Try SEC("tp_btf/tcp_sendmsg_locked") and access the raw argument
> (struct sk_buff *skb) instead of bpf_raw_tracepoint_args.
Nice, I was able to crash the host, with the following code:
SEC("tp_btf/tcp_sendmsg_locked")
int BPF_PROG(tcp_sendmsg_locked, struct sock *sk, struct msghdr *msg, struct sk_buff *skb, int size_goal)
{
bpf_printk("skb->len %d\n", skb->len);
return 0;
}
This is the unusually expected stacktrace. :-)
BUG: kernel NULL pointer dereference, address: 0000000000000070
#PF: supervisor read access in kernel mode "virtme-ng" 11:03 08-Apr-25
#PF: error_code(0x0000) - not-present page
PGD 10ca78067 P4D 0
Oops: Oops: 0000 [#1] SMP DEBUG_PAGEALLOC NOPTI
CPU: 13 UID: 0 PID: 1020 Comm: nc Tainted: G E N 6.14.0-upstream-05880-g14fbb7a1a500 #73 PREEMPT(undef)
Tainted: [E]=UNSIGNED_MODULE, [N]=TEST
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
RIP: 0010:bpf_prog_5b31430a4390397c_tcp_sendmsg_locked+0x18/0x37
Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa 48 8b 7f 10 <8b> 57 70 48 bf d8 d9 03 06 01 00 11 ff be 0d 00 00 00 e8 15 f4 4c
RSP: 0018:ffa0000003c03bd0 EFLAGS: 00010282
RAX: 5aab7562e1de3200 RBX: ffa0000003be4000 RCX: 0000000000000018
RDX: 0000000000000000 RSI: ffa0000003be4048 RDI: 0000000000000000
RBP: ffa0000003c03bd0 R08: 000000000006043d R09: ffffffffffffffff
R10: 0000000000000000 R11: ffffffffa000096c R12: ff11000104ae5b00
R13: ff1100010610a3c0 R14: ffffffff814d34ef R15: 0000000000000000
FS: 00007fd67d550740(0000) GS:ff110005a40a9000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000070 CR3: 000000010d9ec002 CR4: 0000000000771ef0
PKRU: 55555554
Call Trace:
<TASK>
? __die_body+0xaf/0xc0
? page_fault_oops+0x35b/0x3c0
? do_user_addr_fault+0x6d4/0x730
? srso_alias_return_thunk+0x5/0xfbef5
? exc_page_fault+0x5f/0xe0
? asm_exc_page_fault+0x26/0x30
? bpf_trace_run4+0xbf/0x240
? 0xffffffffa000096c
? bpf_prog_5b31430a4390397c_tcp_sendmsg_locked+0x18/0x37
bpf_trace_run4+0x14c/0x240
? trace_event_raw_event_tcp_sendmsg_locked+0xc3/0xf0
__traceiter_tcp_sendmsg_locked+0x44/0x60
tcp_sendmsg_locked+0x10c8/0x15b0
? __local_bh_enable_ip+0x166/0x1c0
? srso_alias_return_thunk+0x5/0xfbef5
tcp_sendmsg+0x2c/0x50
? __pfx_inet6_sendmsg+0x10/0x10
sock_sendmsg_nosec+0xa0/0x100
__sys_sendto+0x1b4/0x1f0
__x64_sys_sendto+0x26/0x30
do_syscall_64+0x83/0x170
entry_SYSCALL_64_after_hwframe+0x76/0x7e
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2025-04-08 18:06 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-07 13:40 [PATCH net-next v2 0/2] trace: add tracepoint to tcp_sendmsg_locked Breno Leitao
2025-04-07 13:40 ` [PATCH net-next v2 1/2] net: pass const to msg_data_left() Breno Leitao
2025-04-08 0:53 ` Kuniyuki Iwashima
2025-04-08 14:20 ` Eric Dumazet
2025-04-07 13:40 ` [PATCH net-next v2 2/2] trace: tcp: Add tracepoint for tcp_sendmsg_locked() Breno Leitao
2025-04-08 1:00 ` Kuniyuki Iwashima
2025-04-08 14:27 ` Breno Leitao
2025-04-08 15:16 ` David Ahern
2025-04-08 17:01 ` Breno Leitao
2025-04-08 17:12 ` Kuniyuki Iwashima
2025-04-08 18:06 ` Breno Leitao
2025-04-08 1:05 ` Kuniyuki Iwashima
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).