From: Jason Xing <kerneljasonxing@gmail.com>
To: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>,
davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
pabeni@redhat.com, dsahern@kernel.org, willemb@google.com,
ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
martin.lau@linux.dev, eddyz87@gmail.com, song@kernel.org,
yonghong.song@linux.dev, john.fastabend@gmail.com,
kpsingh@kernel.org, sdf@fomichev.me, haoluo@google.com,
jolsa@kernel.org, bpf@vger.kernel.org, netdev@vger.kernel.org,
Jason Xing <kernelxing@tencent.com>
Subject: Re: [PATCH net-next 0/9] net-timestamp: bpf extension to equip applications transparently
Date: Wed, 9 Oct 2024 19:12:19 +0800 [thread overview]
Message-ID: <CAL+tcoBL22WsUbooOv6XXcGGugNyogiDhOpszGR_yj-pCdvCkA@mail.gmail.com> (raw)
In-Reply-To: <6b10ed31-c53f-4f99-9c23-e1ba34aa0905@linux.dev>
On Wed, Oct 9, 2024 at 5:28 PM Vadim Fedorenko
<vadim.fedorenko@linux.dev> wrote:
>
> On 09/10/2024 02:05, Jason Xing wrote:
> > On Wed, Oct 9, 2024 at 7:22 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
> >>
> >> On Wed, Oct 9, 2024 at 2:44 AM Willem de Bruijn
> >> <willemdebruijn.kernel@gmail.com> wrote:
> >>>
> >>> Jason Xing wrote:
> >>>> From: Jason Xing <kernelxing@tencent.com>
> >>>>
> >>>> A few weeks ago, I planned to extend SO_TIMESTMAMPING feature by using
> >>>> tracepoint to print information (say, tstamp) so that we can
> >>>> transparently equip applications with this feature and require no
> >>>> modification in user side.
> >>>>
> >>>> Later, we discussed at netconf and agreed that we can use bpf for better
> >>>> extension, which is mainly suggested by John Fastabend and Willem de
> >>>> Bruijn. Many thanks here! So I post this series to see if we have a
> >>>> better solution to extend.
> >>>>
> >>>> This approach relies on existing SO_TIMESTAMPING feature, for tx path,
> >>>> users only needs to pass certain flags through bpf program to make sure
> >>>> the last skb from each sendmsg() has timestamp related controlled flag.
> >>>> For rx path, we have to use bpf_setsockopt() to set the sk->sk_tsflags
> >>>> and wait for the moment when recvmsg() is called.
> >>>
> >>> As you mention, overall I am very supportive of having a way to add
> >>> timestamping by adminstrators, without having to rebuild applications.
> >>> BPF hooks seem to be the right place for this.
> >>>
> >>> There is existing kprobe/kretprobe/kfunc support. Supporting
> >>> SO_TIMESTAMPING directly may be useful due to its targeted feature
> >>> set, and correlation between measurements for the same data in the
> >>> stream.
> >>>
> >>>> After this series, we could step by step implement more advanced
> >>>> functions/flags already in SO_TIMESTAMPING feature for bpf extension.
> >>>
> >>> My main implementation concern is where this API overlaps with the
> >>> existing user API, and how they might conflict. A few questions in the
> >>> patches.
> >>
> >> Agreed. That's also what I'm concerned about. So I decided to ask for
> >> related experts' help.
> >>
> >> How to deal with it without interfering with the existing apps in the
> >> right way is the key problem.
> >
> > What I try to implement is let the bpf program have the highest
> > precedence. It's similar to RTO min, see the commit as an example:
> >
> > commit f086edef71be7174a16c1ed67ac65a085cda28b1
> > Author: Kevin Yang <yyd@google.com>
> > Date: Mon Jun 3 21:30:54 2024 +0000
> >
> > tcp: add sysctl_tcp_rto_min_us
> >
> > Adding a sysctl knob to allow user to specify a default
> > rto_min at socket init time, other than using the hard
> > coded 200ms default rto_min.
> >
> > Note that the rto_min route option has the highest precedence
> > for configuring this setting, followed by the TCP_BPF_RTO_MIN
> > socket option, followed by the tcp_rto_min_us sysctl.
> >
> > It includes three cases, 1) route option, 2) bpf option, 3) sysctl.
> > The first priority can override others. It doesn't have a good
> > chance/point to restore the icsk_rto_min field if users want to
> > shutdown the bpf program because it is set in
> > bpf_sol_tcp_setsockopt().
>
> rto_min example is slightly different. With tcp_rto_min the doesn't
> expect any data to come back to user space while for timestamping the
> app may be confused directly by providing more data, or by not providing
> expected data. I believe some hint about requestor of the data is needed
> here. It will also help to solve the problem of populating sk_err_queue
> mentioned by Martin.
Sorry, I don't fully get it. In this patch series, this bpf extension
feature will not rely on sk_err_queue any more to report tx timestamps
to userspace. Bpf program can do that printing.
Do you mean that it could be wrong if one skb carries the tsflags that
are previously set due to the bpf program and then suddenly users
detach the program? It indeed will put a new/cloned skb into the error
queue. Interesting corner case. It seems I have to re-implement a
totally independent tsflags for bpf extension feature. Do you have a
better idea on this?
Thanks,
Jason
next prev parent reply other threads:[~2024-10-09 11:12 UTC|newest]
Thread overview: 49+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-08 9:51 [PATCH net-next 0/9] net-timestamp: bpf extension to equip applications transparently Jason Xing
2024-10-08 9:51 ` [PATCH net-next 1/9] net-timestamp: add bpf infrastructure to allow exposing more information later Jason Xing
2024-10-08 18:45 ` Willem de Bruijn
2024-10-08 23:27 ` Jason Xing
2024-10-09 13:22 ` Willem de Bruijn
2024-10-09 13:57 ` Jason Xing
2024-10-09 0:58 ` Kuniyuki Iwashima
2024-10-09 8:11 ` Jason Xing
2024-10-08 9:51 ` [PATCH net-next 2/9] net-timestamp: introduce TS_SCHED_OPT_CB to generate dev xmit timestamp Jason Xing
2024-10-08 9:51 ` [PATCH net-next 3/9] net-timestamp: introduce TS_SW_OPT_CB to generate driver timestamp Jason Xing
2024-10-08 19:13 ` Vadim Fedorenko
2024-10-08 23:08 ` Jason Xing
2024-10-08 9:51 ` [PATCH net-next 4/9] net-timestamp: introduce TS_ACK_OPT_CB to generate tcp acked timestamp Jason Xing
2024-10-08 9:51 ` [PATCH net-next 5/9] net-timestamp: ready to turn on the button to generate tx timestamps Jason Xing
2024-10-08 18:53 ` Willem de Bruijn
2024-10-08 23:37 ` Jason Xing
2024-10-08 19:18 ` Vadim Fedorenko
2024-10-08 23:48 ` Jason Xing
2024-10-09 9:16 ` Vadim Fedorenko
2024-10-09 11:15 ` Jason Xing
2024-10-08 9:51 ` [PATCH net-next 6/9] net-timestamp: add tx OPT_ID_TCP support for bpf case Jason Xing
2024-10-08 18:56 ` Willem de Bruijn
2024-10-08 23:18 ` Jason Xing
2024-10-09 13:19 ` Willem de Bruijn
2024-10-09 13:52 ` Jason Xing
2024-10-08 9:51 ` [PATCH net-next 7/9] net-timestamp: open gate for bpf_setsockopt Jason Xing
2024-10-09 7:19 ` Martin KaFai Lau
2024-10-09 8:09 ` Jason Xing
2024-10-09 13:23 ` Willem de Bruijn
2024-10-09 13:48 ` Jason Xing
2024-10-08 9:51 ` [PATCH net-next 8/9] net-timestamp: add bpf framework for rx timestamps Jason Xing
2024-10-09 0:22 ` Jakub Kicinski
2024-10-09 0:30 ` Jason Xing
2024-10-09 2:33 ` kernel test robot
2024-10-09 4:17 ` kernel test robot
2024-10-09 5:09 ` kernel test robot
2024-10-08 9:51 ` [PATCH net-next 9/9] net-timestamp: add bpf support for rx software/hardware timestamp Jason Xing
2024-10-08 18:44 ` [PATCH net-next 0/9] net-timestamp: bpf extension to equip applications transparently Willem de Bruijn
2024-10-08 23:22 ` Jason Xing
2024-10-09 1:05 ` Jason Xing
2024-10-09 9:27 ` Vadim Fedorenko
2024-10-09 11:12 ` Jason Xing [this message]
2024-10-09 11:48 ` Jason Xing
2024-10-09 13:16 ` Vadim Fedorenko
2024-10-09 13:47 ` Jason Xing
2024-10-09 13:58 ` Vadim Fedorenko
2024-10-09 14:35 ` Jason Xing
2024-10-09 14:59 ` Vadim Fedorenko
2024-10-09 15:20 ` Jason Xing
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAL+tcoBL22WsUbooOv6XXcGGugNyogiDhOpszGR_yj-pCdvCkA@mail.gmail.com \
--to=kerneljasonxing@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=dsahern@kernel.org \
--cc=eddyz87@gmail.com \
--cc=edumazet@google.com \
--cc=haoluo@google.com \
--cc=john.fastabend@gmail.com \
--cc=jolsa@kernel.org \
--cc=kernelxing@tencent.com \
--cc=kpsingh@kernel.org \
--cc=kuba@kernel.org \
--cc=martin.lau@linux.dev \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=sdf@fomichev.me \
--cc=song@kernel.org \
--cc=vadim.fedorenko@linux.dev \
--cc=willemb@google.com \
--cc=willemdebruijn.kernel@gmail.com \
--cc=yonghong.song@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).