From: Yonghong Song <yonghong.song@linux.dev>
To: Breno Leitao <leitao@debian.org>, Steven Rostedt <rostedt@goodmis.org>
Cc: Jason Xing <kerneljasonxing@gmail.com>,
Eric Dumazet <edumazet@google.com>,
Masami Hiramatsu <mhiramat@kernel.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
"David S. Miller" <davem@davemloft.net>,
David Ahern <dsahern@kernel.org>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Simon Horman <horms@kernel.org>,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-trace-kernel@vger.kernel.org, kernel-team@meta.com,
Song Liu <song@kernel.org>,
Martin KaFai Lau <martin.lau@kernel.org>
Subject: Re: [PATCH RFC net-next] trace: tcp: Add tracepoint for tcp_cwnd_reduction()
Date: Wed, 22 Jan 2025 10:40:20 -0800
Message-ID: <be0f5d0d-479e-46a3-9e9c-ebcd0b1987e9@linux.dev>
In-Reply-To: <20250122-vengeful-myna-of-tranquility-f0f8cf@leitao>
On 1/22/25 1:39 AM, Breno Leitao wrote:
> Hello Steven,
>
> On Mon, Jan 20, 2025 at 10:03:40AM -0500, Steven Rostedt wrote:
>> On Mon, 20 Jan 2025 05:20:05 -0800
>> Breno Leitao <leitao@debian.org> wrote:
>>
>>> This patch enhances the API's stability by introducing a guaranteed hook
>>> point, allowing the compiler to make changes without disrupting the
>>> BPF program's functionality.
>> Instead of using a TRACE_EVENT() macro, you can use DECLARE_TRACE()
>> which will create the tracepoint in the kernel, but will not create a
>> trace event that is exported to the tracefs file system. Then BPF could
>> hook to it and it will still not be exposed as an user space API.
> Right, DECLARE_TRACE would solve my current problem, but, a056a5bed7fa
> ("sched/debug: Export the newly added tracepoints") says "BPF doesn't
> have infrastructure to access these bare tracepoints either.".
>
> Does BPF know how to attach to this bare tracepointers now?
>
> On the other side, it seems real tracepoints is getting more pervasive?
> So, this current approach might be OK also?
>
> https://lore.kernel.org/bpf/20250118033723.GV1977892@ZenIV/T/#m4c2fb2d904e839b34800daf8578dff0b9abd69a0
>
>> You can see its use in include/trace/events/sched.h
> I suppose I need to export the tracepointer with
> EXPORT_TRACEPOINT_SYMBOL_GPL(), right?
>
> I am trying to hack something as the following, but, I struggled to hook
> BPF into it.
>
> Thank you!
> --breno
>
> Author: Breno Leitao <leitao@debian.org>
> Date: Fri Jan 17 09:26:22 2025 -0800
>
> trace: tcp: Add tracepoint for tcp_cwnd_reduction()
>
> Add a lightweight tracepoint to monitor TCP congestion window
> adjustments via tcp_cwnd_reduction(). This tracepoint enables tracking
> of:
> - TCP window size fluctuations
> - Active socket behavior
> - Congestion window reduction events
>
> Meta has been using BPF programs to monitor this function for years.
> Adding a proper tracepoint provides a stable API for all users who need
> to monitor TCP congestion window behavior.
>
> Use DECLARE_TRACE instead of TRACE_EVENT to avoid creating trace event
> infrastructure and exporting to tracefs, keeping the implementation
> minimal. (Thanks Steven Rostedt)
>
> Signed-off-by: Breno Leitao <leitao@debian.org>
>
> diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
> index a27c4b619dffd..07add3e20931a 100644
> --- a/include/trace/events/tcp.h
> +++ b/include/trace/events/tcp.h
> @@ -259,6 +259,11 @@ TRACE_EVENT(tcp_retransmit_synack,
> __entry->saddr_v6, __entry->daddr_v6)
> );
>
> +DECLARE_TRACE(tcp_cwnd_reduction_tp,
> + TP_PROTO(const struct sock *sk, const int newly_acked_sacked,
> + const int newly_lost, const int flag),

I don't think we need 'const' for the int arguments, since they are passed
by value. For 'const struct sock *' it makes sense, since we do not want
sk-><fields> to be changed.

> + TP_ARGS(sk, newly_acked_sacked, newly_lost, flag));
> +
> #include <trace/events/net_probe_common.h>
>
> TRACE_EVENT(tcp_probe,
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 4811727b8a022..74cf8dbbedaa0 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -2710,6 +2710,8 @@ void tcp_cwnd_reduction(struct sock *sk, int newly_acked_sacked, int newly_lost,
> if (newly_acked_sacked <= 0 || WARN_ON_ONCE(!tp->prior_cwnd))
> return;
>
> + trace_tcp_cwnd_reduction_tp(sk, newly_acked_sacked, newly_lost, flag);
> +
> tp->prr_delivered += newly_acked_sacked;
> if (delta < 0) {
> u64 dividend = (u64)tp->snd_ssthresh * tp->prr_delivered +
> @@ -2726,6 +2728,7 @@ void tcp_cwnd_reduction(struct sock *sk, int newly_acked_sacked, int newly_lost,
> sndcnt = max(sndcnt, (tp->prr_out ? 0 : 1));
> tcp_snd_cwnd_set(tp, tcp_packets_in_flight(tp) + sndcnt);
> }
> +EXPORT_TRACEPOINT_SYMBOL_GPL(tcp_cwnd_reduction_tp);
>
> static inline void tcp_end_cwnd_reduction(struct sock *sk)
> {
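
To illustrate the 'const' remark on TP_PROTO above, here is a small
userspace sketch (not kernel code). struct demo_sock and the function names
are hypothetical stand-ins, made up only for this example. It shows that
'const' on a by-value int only restricts the callee's private copy, while
'const' on the pointee is what actually prevents writes to the object.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sock; NOT the kernel definition. */
struct demo_sock {
	int cwnd;
};

/*
 * By-value int: the callee gets its own copy, so the caller's variable
 * cannot be changed whether or not the parameter is marked const.
 */
static int probe_by_value(int newly_lost)
{
	newly_lost += 1;	/* touches the local copy only */
	return newly_lost;
}

/*
 * Pointer to const: here the qualifier matters, because it forbids
 * writes to the caller's object through sk.
 */
static int probe_read_only(const struct demo_sock *sk)
{
	assert(sk != NULL);
	/* sk->cwnd = 0; would be rejected by the compiler */
	return sk->cwnd;
}

/*
 * A top-level const on a by-value parameter is not part of the function
 * type, so these two declarations are compatible and name one function.
 */
static int same_type(int flag);
static int same_type(const int flag);

static int same_type(const int flag)
{
	return flag;
}
```

So dropping 'const' from the three int arguments changes nothing for
callers or for programs attached to the tracepoint, and keeps the prototype
shorter.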
Thread overview: 15+ messages
2025-01-20 12:02 [PATCH RFC net-next] trace: tcp: Add tracepoint for tcp_cwnd_reduction() Breno Leitao
2025-01-20 12:08 ` Jason Xing
2025-01-20 13:02 ` Breno Leitao
2025-01-20 13:06 ` Jason Xing
2025-01-20 13:20 ` Breno Leitao
2025-01-20 15:03 ` Steven Rostedt
2025-01-22 9:39 ` Breno Leitao
2025-01-22 14:56 ` Steven Rostedt
2025-01-22 19:02 ` Yonghong Song
2025-01-24 4:40 ` Yonghong Song
2025-01-24 15:50 ` Steven Rostedt
2025-01-24 17:35 ` Yonghong Song
2025-01-22 18:40 ` Yonghong Song [this message]
2025-01-21 1:15 ` Jason Xing
2025-01-21 1:22 ` Jason Xing