linux-trace-kernel.vger.kernel.org archive mirror
From: Yonghong Song <yonghong.song@linux.dev>
To: Breno Leitao <leitao@debian.org>, Steven Rostedt <rostedt@goodmis.org>
Cc: Jason Xing <kerneljasonxing@gmail.com>,
	Eric Dumazet <edumazet@google.com>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	"David S. Miller" <davem@davemloft.net>,
	David Ahern <dsahern@kernel.org>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Simon Horman <horms@kernel.org>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-trace-kernel@vger.kernel.org, kernel-team@meta.com,
	Song Liu <song@kernel.org>,
	Martin KaFai Lau <martin.lau@kernel.org>
Subject: Re: [PATCH RFC net-next] trace: tcp: Add tracepoint for tcp_cwnd_reduction()
Date: Wed, 22 Jan 2025 10:40:20 -0800
Message-ID: <be0f5d0d-479e-46a3-9e9c-ebcd0b1987e9@linux.dev>
In-Reply-To: <20250122-vengeful-myna-of-tranquility-f0f8cf@leitao>




On 1/22/25 1:39 AM, Breno Leitao wrote:
> Hello Steven,
>
> On Mon, Jan 20, 2025 at 10:03:40AM -0500, Steven Rostedt wrote:
>> On Mon, 20 Jan 2025 05:20:05 -0800
>> Breno Leitao <leitao@debian.org> wrote:
>>
>>> This patch enhances the API's stability by introducing a guaranteed hook
>>> point, allowing the compiler to make changes without disrupting the
>>> BPF program's functionality.
>> Instead of using a TRACE_EVENT() macro, you can use DECLARE_TRACE()
>> which will create the tracepoint in the kernel, but will not create a
>> trace event that is exported to the tracefs file system. Then BPF could
>> hook to it and it will still not be exposed as a user space API.
> Right, DECLARE_TRACE would solve my current problem, but commit
> a056a5bed7fa ("sched/debug: Export the newly added tracepoints") says
> "BPF doesn't have infrastructure to access these bare tracepoints either.".
>
> Does BPF know how to attach to these bare tracepoints now?
>
> On the other hand, it seems real tracepoints are getting more pervasive?
> So, the current approach might be OK as well?
>
> 	https://lore.kernel.org/bpf/20250118033723.GV1977892@ZenIV/T/#m4c2fb2d904e839b34800daf8578dff0b9abd69a0
>
>> You can see its use in include/trace/events/sched.h
> I suppose I need to export the tracepoint with
> EXPORT_TRACEPOINT_SYMBOL_GPL(), right?
>
> I am trying to hack together something like the following, but I
> struggled to hook BPF into it.
>
> Thank you!
> --breno
>
> Author: Breno Leitao <leitao@debian.org>
> Date:   Fri Jan 17 09:26:22 2025 -0800
>
>      trace: tcp: Add tracepoint for tcp_cwnd_reduction()
>      
>      Add a lightweight tracepoint to monitor TCP congestion window
>      adjustments via tcp_cwnd_reduction(). This tracepoint enables tracking
>      of:
>        - TCP window size fluctuations
>        - Active socket behavior
>        - Congestion window reduction events
>      
>      Meta has been using BPF programs to monitor this function for years.
>      Adding a proper tracepoint provides a stable API for all users who need
>      to monitor TCP congestion window behavior.
>      
>      Use DECLARE_TRACE instead of TRACE_EVENT to avoid creating trace event
>      infrastructure and exporting to tracefs, keeping the implementation
>      minimal. (Thanks Steven Rostedt)
>
>      Signed-off-by: Breno Leitao <leitao@debian.org>
>
> diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
> index a27c4b619dffd..07add3e20931a 100644
> --- a/include/trace/events/tcp.h
> +++ b/include/trace/events/tcp.h
> @@ -259,6 +259,11 @@ TRACE_EVENT(tcp_retransmit_synack,
>   		  __entry->saddr_v6, __entry->daddr_v6)
>   );
>   
> +DECLARE_TRACE(tcp_cwnd_reduction_tp,
> +	TP_PROTO(const struct sock *sk, const int newly_acked_sacked,
> +		 const int newly_lost, const int flag),

I don't think we need 'const' for the int types. For 'const struct sock *',
it makes sense, since we do not want the fields behind sk to be changed.
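
To illustrate the difference, here is a minimal user-space sketch, not
kernel code. 'struct demo_sock', peek_cwnd() and check_demo() are
hypothetical names made up for this example. The int is passed by value,
so a 'const' there only restricts the callee's own copy, while 'const' on
the pointee is what prevents writes through sk:

```c
#include <assert.h>

/* Hypothetical stand-in for struct sock, for illustration only. */
struct demo_sock {
	int cwnd;
};

/*
 * newly_acked_sacked is a by-value copy: the caller's variable cannot be
 * changed from here whether or not the parameter is declared 'const'.
 * The 'const' on the pointee is the qualifier that matters, because it
 * rejects writes through sk at compile time.
 */
static int peek_cwnd(const struct demo_sock *sk, int newly_acked_sacked)
{
	newly_acked_sacked += 1;	/* touches the local copy only */
	/* sk->cwnd = 0; would not compile: the pointee is const */
	return sk->cwnd + newly_acked_sacked;
}

/* Small self-check: neither the caller's int nor the struct is modified. */
static void check_demo(void)
{
	struct demo_sock sk = { .cwnd = 10 };
	int acked = 3;

	assert(peek_cwnd(&sk, acked) == 14);
	assert(acked == 3);
	assert(sk.cwnd == 10);
}
```

Calling check_demo() from a main() exercises all three checks. The same
reasoning is why plain 'int' should be enough in TP_PROTO() above.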

> +	TP_ARGS(sk, newly_acked_sacked, newly_lost, flag));
> +
>   #include <trace/events/net_probe_common.h>
>   
>   TRACE_EVENT(tcp_probe,
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 4811727b8a022..74cf8dbbedaa0 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -2710,6 +2710,8 @@ void tcp_cwnd_reduction(struct sock *sk, int newly_acked_sacked, int newly_lost,
>   	if (newly_acked_sacked <= 0 || WARN_ON_ONCE(!tp->prior_cwnd))
>   		return;
>   
> +	trace_tcp_cwnd_reduction_tp(sk, newly_acked_sacked, newly_lost, flag);
> +
>   	tp->prr_delivered += newly_acked_sacked;
>   	if (delta < 0) {
>   		u64 dividend = (u64)tp->snd_ssthresh * tp->prr_delivered +
> @@ -2726,6 +2728,7 @@ void tcp_cwnd_reduction(struct sock *sk, int newly_acked_sacked, int newly_lost,
>   	sndcnt = max(sndcnt, (tp->prr_out ? 0 : 1));
>   	tcp_snd_cwnd_set(tp, tcp_packets_in_flight(tp) + sndcnt);
>   }
> +EXPORT_TRACEPOINT_SYMBOL_GPL(tcp_cwnd_reduction_tp);
>   
>   static inline void tcp_end_cwnd_reduction(struct sock *sk)
>   {

