From: Martin KaFai Lau <martin.lau@linux.dev>
To: Jason Xing <kerneljasonxing@gmail.com>
Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
pabeni@redhat.com, dsahern@kernel.org,
willemdebruijn.kernel@gmail.com, willemb@google.com,
ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
eddyz87@gmail.com, song@kernel.org, yonghong.song@linux.dev,
john.fastabend@gmail.com, kpsingh@kernel.org, sdf@fomichev.me,
haoluo@google.com, jolsa@kernel.org, horms@kernel.org,
bpf@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH bpf-next v9 11/12] bpf: support selective sampling for bpf timestamping
Date: Mon, 10 Feb 2025 23:41:49 -0800
Message-ID: <e419521b-c38e-41e0-a4da-93dcbb820486@linux.dev>
In-Reply-To: <20250208103220.72294-12-kerneljasonxing@gmail.com>
On 2/8/25 2:32 AM, Jason Xing wrote:
> Use __bpf_kfunc feature to allow bpf prog dynamically and selectively
s/Use/Add/
Remove "dynamically". A kfunc can only be called dynamically at runtime.
Like:
"Add the bpf_sock_ops_enable_tx_tstamp kfunc to allow BPF programs to
selectively enable TX timestamping on a skb during tcp_sendmsg..."
> to sample/track the skb. For example, the bpf prog will limit tracking
> X numbers of packets and then will stop there instead of tracing
> all the sendmsgs of matched flow all along.
>
> Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
> ---
> kernel/bpf/btf.c | 1 +
> net/core/filter.c | 27 ++++++++++++++++++++++++++-
> 2 files changed, 27 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index 8396ce1d0fba..a65e2eeffb88 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c
> @@ -8535,6 +8535,7 @@ static int bpf_prog_type_to_kfunc_hook(enum bpf_prog_type prog_type)
> case BPF_PROG_TYPE_CGROUP_SOCK_ADDR:
> case BPF_PROG_TYPE_CGROUP_SOCKOPT:
> case BPF_PROG_TYPE_CGROUP_SYSCTL:
> + case BPF_PROG_TYPE_SOCK_OPS:
> return BTF_KFUNC_HOOK_CGROUP;
> case BPF_PROG_TYPE_SCHED_ACT:
> return BTF_KFUNC_HOOK_SCHED_ACT;
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 7f56d0bbeb00..db20a947e757 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -12102,6 +12102,21 @@ __bpf_kfunc int bpf_sk_assign_tcp_reqsk(struct __sk_buff *s, struct sock *sk,
> #endif
> }
>
> +__bpf_kfunc int bpf_sock_ops_enable_tx_tstamp(struct bpf_sock_ops_kern *skops)
I am ok to always enable txstamp_ack here. Please still add a second "u64 flags"
argument such that future disable/enable is still possible.
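For illustration only, here is a userspace sketch of the shape with a reserved
"flags" argument. The struct, enum and bit values are made-up stand-ins, not the
kernel definitions. Rejecting a non-zero flags value with -EINVAL is an
assumption (a common way to keep the bits available for later), not something
the patch does today. txstamp_ack is left out of the model on purpose.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Stand-ins for the kernel definitions; names and values are made up. */
enum { MODEL_OP_TS_SND_CB = 1 };
#define MODEL_SKBTX_BPF 0x80u

struct model_skb {
	uint32_t seq;      /* models TCP_SKB_CB(skb)->seq */
	uint32_t len;      /* models skb->len */
	uint32_t tskey;    /* models skb_shinfo(skb)->tskey */
	uint8_t  tx_flags; /* models skb_shinfo(skb)->tx_flags */
};

struct model_sock_ops {
	int op;
	struct model_skb *skb;
};

/* Assumption: flags is reserved, so any non-zero value is rejected for now. */
static int model_enable_tx_tstamp(struct model_sock_ops *skops, uint64_t flags)
{
	struct model_skb *skb;

	if (skops->op != MODEL_OP_TS_SND_CB)
		return -EOPNOTSUPP;
	if (flags)
		return -EINVAL;

	skb = skops->skb;
	skb->tx_flags |= MODEL_SKBTX_BPF;
	skb->tskey = skb->seq + skb->len - 1;
	return 0;
}

/* Small usage example: unknown flags fail without touching the skb. */
static void model_selfcheck(void)
{
	struct model_skb skb = { .seq = 1000, .len = 100 };
	struct model_sock_ops ops = { .op = MODEL_OP_TS_SND_CB, .skb = &skb };

	assert(model_enable_tx_tstamp(&ops, 1) == -EINVAL);
	assert(skb.tx_flags == 0);
	assert(model_enable_tx_tstamp(&ops, 0) == 0);
	assert(skb.tskey == 1099);
	assert(skb.tx_flags & MODEL_SKBTX_BPF);
}
```

With the argument in place from the start, a later flag bit can be defined
without changing the kfunc signature.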
> +{
> + struct sk_buff *skb;
> +
> + if (skops->op != BPF_SOCK_OPS_TS_SND_CB)
> + return -EOPNOTSUPP;
> +
> + skb = skops->skb;
> + TCP_SKB_CB(skb)->txstamp_ack = 2;
Willem (thanks!) has already mentioned there is a bug.
This also brought up that a test is missing: bpf timestamping and user space's
SO_TIMESTAMPING should work without interfering with each other. The current
test only has SK_BPF_CB_TX_TIMESTAMPING on. A test is needed with both
SK_BPF_CB_TX_TIMESTAMPING and user space's SO_TIMESTAMPING on. The
expectation is that both of them work together.
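To make the "both on" expectation concrete, here is a small userspace model.
It is not a restatement of Willem's report. It assumes txstamp_ack is a 2-bit
field where bit 0 belongs to user space and bit 1 to bpf, which is only a
reading of the "= 2" above. The macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed split of a 2-bit txstamp_ack; the names are hypothetical. */
#define MODEL_ACK_USER 0x1u
#define MODEL_ACK_BPF  0x2u

/* Plain assignment: whatever was requested before is dropped. */
static uint8_t model_set_assign(uint8_t ack)
{
	(void)ack;
	return MODEL_ACK_BPF;
}

/* OR-ing in the bpf bit keeps an earlier user space request. */
static uint8_t model_set_or(uint8_t ack)
{
	return ack | MODEL_ACK_BPF;
}

static void model_coexist_check(void)
{
	uint8_t ack = MODEL_ACK_USER; /* user space SO_TIMESTAMPING asked first */

	/* After assignment the user space bit is gone. */
	assert(!(model_set_assign(ack) & MODEL_ACK_USER));
	/* After OR both requests are still visible. */
	assert(model_set_or(ack) == (MODEL_ACK_USER | MODEL_ACK_BPF));
}
```

A selftest with both SK_BPF_CB_TX_TIMESTAMPING and SO_TIMESTAMPING enabled
would catch this kind of interference in the real code path.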
> + skb_shinfo(skb)->tx_flags |= SKBTX_BPF;
> + skb_shinfo(skb)->tskey = TCP_SKB_CB(skb)->seq + skb->len - 1;
> +
> + return 0;
> +}
> +
> __bpf_kfunc_end_defs();
>
> int bpf_dynptr_from_skb_rdonly(struct __sk_buff *skb, u64 flags,
> @@ -12135,6 +12150,10 @@ BTF_KFUNCS_START(bpf_kfunc_check_set_tcp_reqsk)
> BTF_ID_FLAGS(func, bpf_sk_assign_tcp_reqsk, KF_TRUSTED_ARGS)
> BTF_KFUNCS_END(bpf_kfunc_check_set_tcp_reqsk)
>
> +BTF_KFUNCS_START(bpf_kfunc_check_set_sock_ops)
> +BTF_ID_FLAGS(func, bpf_sock_ops_enable_tx_tstamp, KF_TRUSTED_ARGS)
> +BTF_KFUNCS_END(bpf_kfunc_check_set_sock_ops)
> +
> static const struct btf_kfunc_id_set bpf_kfunc_set_skb = {
> .owner = THIS_MODULE,
> .set = &bpf_kfunc_check_set_skb,
> @@ -12155,6 +12174,11 @@ static const struct btf_kfunc_id_set bpf_kfunc_set_tcp_reqsk = {
> .set = &bpf_kfunc_check_set_tcp_reqsk,
> };
>
> +static const struct btf_kfunc_id_set bpf_kfunc_set_sock_ops = {
> + .owner = THIS_MODULE,
> + .set = &bpf_kfunc_check_set_sock_ops,
> +};
> +
> static int __init bpf_kfunc_init(void)
> {
> int ret;
> @@ -12173,7 +12197,8 @@ static int __init bpf_kfunc_init(void)
> ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_XDP, &bpf_kfunc_set_xdp);
> ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_CGROUP_SOCK_ADDR,
> &bpf_kfunc_set_sock_addr);
> - return ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &bpf_kfunc_set_tcp_reqsk);
> + ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &bpf_kfunc_set_tcp_reqsk);
> + return ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SOCK_OPS, &bpf_kfunc_set_sock_ops);
> }
> late_initcall(bpf_kfunc_init);
>