Re: [PATCH bpf-next v8 10/12] bpf: make TCP tx timestamp bpf extension work

From: Martin KaFai Lau <martin.lau@linux.dev>
To: Jason Xing <kerneljasonxing@gmail.com>
Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>,
	Jakub Kicinski <kuba@kernel.org>,
	davem@davemloft.net, edumazet@google.com, pabeni@redhat.com,
	dsahern@kernel.org, willemb@google.com, ast@kernel.org,
	daniel@iogearbox.net, andrii@kernel.org, eddyz87@gmail.com,
	song@kernel.org, yonghong.song@linux.dev,
	john.fastabend@gmail.com, kpsingh@kernel.org, sdf@fomichev.me,
	haoluo@google.com, jolsa@kernel.org, horms@kernel.org,
	bpf@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH bpf-next v8 10/12] bpf: make TCP tx timestamp bpf extension work
Date: Wed, 5 Feb 2025 22:12:29 -0800
Message-ID: <b158a837-d46c-4ae0-8130-7aa288422182@linux.dev>
In-Reply-To: <CAL+tcoC_5106onp6yQh-dKnCTLtEr73EZVC31T_YeMtqbZ5KBw@mail.gmail.com>

On 2/5/25 7:41 PM, Jason Xing wrote:
> On Thu, Feb 6, 2025 at 11:25 AM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
>>
>>>>> I think we can split the whole idea into two parts: for now, because
>>>>> the current series implements the same function as SO_TIMESTAMPING
>>>>> does, I will implement the selective sampling feature in the series.
>>>>> Once we finish tracing all the skbs someday, we will add the
>>>>> corresponding selective sampling feature.
>>>>
>>>> Are you saying that you will include selective sampling now or want to
>>>> postpone it?
>>>
>>> A few months ago, I planned to do it after this series. Since you all
>>> ask, it's not complex to have it included in this series :)
>>>
>>> Selective sampling has two kinds of meaning like I mentioned above, so
>>> in the next re-spin I will implement the cmsg feature for bpf
>>> extension in this series.
>>
>> Great thanks.
> 
> I have to rephrase a bit in case Martin visits here soon: I will
> compare two approaches 1) reply value, 2) bpf kfunc and then see which
> way is better.

I have already explained in detail why 1), the reply value from the bpf prog,
won't work. Please go back to that reply, which has the context.

> 
>>
>>> I'm doing the test right now. And leave
>>> another selective sampling small feature until the feature of tracing
>>> all the skbs is implemented if possible.
>>
>> Can you elaborate on this other feature?
> 
> Do you recall one day I asked your opinion privately about whether we
> can trace _all the skbs_ (not the last skb from each sendmsg) to have
> a better insight of kernel behaviour? I can also see a couple of
> latency issues in the kernel. If it is approved, then corresponding
> selective sampling should be supported. It's what I was trying to
> describe.
> 
> The advantage of relying on the timestamping feature is that we can
> isolate normal flows and monitored flow so that normal flows wouldn't
> be affected because of enabling the monitoring feature, compared to so
> many open source monitoring applications I've dug into. They usually
> directly hook the hot path like __tcp_transmit_skb() or
> dev_queue_xmit, which will surely influence the normal flows and cause
> performance degradation to some extent. I noticed that after
> conducting some tests a few months ago. The principle behind the bpf
> fentry is to replace some instructions at the very beginning of the
> hooked function, so every time even normal flows entering the
> monitored function will get affected.

I sort of guessed this while stalled in traffic... :/

I was not asking for selective sampling "on all skbs of a large msg". That is
a separate topic. If we really want to support this case in the future (tbh, I
am not convinced), that is all the more reason for the default behavior to be
"off" now, for consistency.

The comment was on the existing tcp_tx_timestamp(). First focus on allowing
selective tracking of the skb that the current tcp_tx_timestamp() also tracks,
because it is the most understood use case. This will allow the bpf prog to
select which tcp_sendmsg call it should track/sample. Perhaps the bpf prog will
limit tracking to X packets and then stop there. Perhaps the bpf prog will only
allocate X sample slots in the bpf_sk_storage to track packets. There are many
reasons that a bpf prog may want to sample and stop tracking at some point, even
in the current tcp_tx_timestamp().
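
To make the "track X and then stop" idea concrete, here is a minimal userspace
C sketch of the decision logic only. It is not BPF code and does not use any
kernel API: struct sample_state, MAX_SAMPLES and should_track() are
hypothetical names made up for illustration. In a real bpf prog the state
would presumably live in bpf_sk_storage, and how the prog tells the kernel to
track a given sendmsg is exactly what is still under discussion in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical budget: the "X" sample slots a prog is willing to spend. */
#define MAX_SAMPLES 4

/* Hypothetical per-socket state; stands in for data kept in bpf_sk_storage. */
struct sample_state {
	unsigned int used;                        /* slots consumed so far */
	unsigned long long sent_ns[MAX_SAMPLES];  /* one timestamp per tracked sendmsg */
};

/*
 * Decide whether the current sendmsg call should be tracked.
 * Reserves a slot and records the timestamp while budget remains;
 * returns false (do not track) once all slots are used.
 */
static bool should_track(struct sample_state *st, unsigned long long now_ns)
{
	assert(st != NULL);

	if (st->used >= MAX_SAMPLES)
		return false;

	st->sent_ns[st->used++] = now_ns;
	return true;
}
```

With this shape, the first MAX_SAMPLES calls on a socket are tracked and every
later call is skipped, so untracked sends pay only for the counter check.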
