From: Martin KaFai Lau <martin.lau@linux.dev>
To: Jason Xing <kerneljasonxing@gmail.com>
Cc: Jakub Sitnicki <jakub@cloudflare.com>,
davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
pabeni@redhat.com, dsahern@kernel.org,
willemdebruijn.kernel@gmail.com, willemb@google.com,
ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
eddyz87@gmail.com, song@kernel.org, yonghong.song@linux.dev,
john.fastabend@gmail.com, kpsingh@kernel.org, sdf@fomichev.me,
haoluo@google.com, jolsa@kernel.org, bpf@vger.kernel.org,
netdev@vger.kernel.org, Jason Xing <kernelxing@tencent.com>
Subject: Re: [PATCH net-next v2 04/12] net-timestamp: add static key to control the whole bpf extension
Date: Wed, 16 Oct 2024 00:01:08 -0700 [thread overview]
Message-ID: <5398c020-e9b4-49d2-a5fa-dca047296ddd@linux.dev> (raw)
In-Reply-To: <CAL+tcoDyt=3hjwdx8Wk-abKg=qQsY=7UKu9=TU4iUAk5gMT2MQ@mail.gmail.com>
On 10/15/24 11:30 PM, Jason Xing wrote:
> On Wed, Oct 16, 2024 at 2:13 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>>
>> On 10/15/24 6:32 PM, Jason Xing wrote:
>>> On Wed, Oct 16, 2024 at 9:04 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
>>>>
>>>> On Wed, Oct 16, 2024 at 8:10 AM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>>>>>
>>>>> On 10/11/24 9:06 PM, Jason Xing wrote:
>>>>>> From: Jason Xing <kernelxing@tencent.com>
>>>>>>
>>>>>> Willem suggested that we use a static key to control. The advantage
>>>>>> is that we will not affect the existing applications at all if we
>>>>>> don't load BPF program.
>>>>>>
>>>>>> In this patch, except the static key, I also add one logic that is
>>>>>> used to test if the socket has enabled its tsflags in order to
>>>>>> support bpf logic to allow both cases to happen at the same time.
>>>>>> Or else, the skb carring related timestamp flag doesn't know which
>>>>>> way of printing is desirable.
>>>>>>
>>>>>> One thing important is this patch allows print from both applications
>>>>>> and bpf program at the same time. Now we have three kinds of print:
>>>>>> 1) only BPF program prints
>>>>>> 2) only application program prints
>>>>>> 3) both can print without side effect
>>>>>>
>>>>>> Signed-off-by: Jason Xing <kernelxing@tencent.com>
>>>>>> ---
>>>>>> include/net/sock.h | 1 +
>>>>>> net/core/filter.c | 3 +++
>>>>>> net/core/skbuff.c | 38 ++++++++++++++++++++++++++++++++++++++
>>>>>> 3 files changed, 42 insertions(+)
>>>>>>
>>>>>> diff --git a/include/net/sock.h b/include/net/sock.h
>>>>>> index 66ecd78f1dfe..b7c51b95c92d 100644
>>>>>> --- a/include/net/sock.h
>>>>>> +++ b/include/net/sock.h
>>>>>> @@ -2889,6 +2889,7 @@ static inline bool sk_dev_equal_l3scope(struct sock *sk, int dif)
>>>>>> void sock_def_readable(struct sock *sk);
>>>>>>
>>>>>> int sock_bindtoindex(struct sock *sk, int ifindex, bool lock_sk);
>>>>>> +DECLARE_STATIC_KEY_FALSE(bpf_tstamp_control);
>>>>>> void sock_set_timestamp(struct sock *sk, int optname, bool valbool);
>>>>>> int sock_get_timestamping(struct so_timestamping *timestamping,
>>>>>> sockptr_t optval, unsigned int optlen);
>>>>>> diff --git a/net/core/filter.c b/net/core/filter.c
>>>>>> index 996426095bd9..08135f538c99 100644
>>>>>> --- a/net/core/filter.c
>>>>>> +++ b/net/core/filter.c
>>>>>> @@ -5204,6 +5204,8 @@ static const struct bpf_func_proto bpf_get_socket_uid_proto = {
>>>>>> .arg1_type = ARG_PTR_TO_CTX,
>>>>>> };
>>>>>>
>>>>>> +DEFINE_STATIC_KEY_FALSE(bpf_tstamp_control);
>>>>>> +
>>>>>> static int bpf_sock_set_timestamping(struct sock *sk,
>>>>>> struct so_timestamping *timestamping)
>>>>>> {
>>>>>> @@ -5217,6 +5219,7 @@ static int bpf_sock_set_timestamping(struct sock *sk,
>>>>>> return -EINVAL;
>>>>>>
>>>>>> WRITE_ONCE(sk->sk_tsflags[BPFPROG_TS_REQUESTOR], flags);
>>>>>> + static_branch_enable(&bpf_tstamp_control);
>>>>>
>>>>> Not sure when is a good time to do static_branch_disable().
>>>>
>>>> Thanks for the review.
>>>>
>>>> To be honest, I considered how to disable the static key. Like you
>>>> said, I failed to find a good chance that I can accurately disable it.
>>>>
>>>>>
>>>>> The bpf prog may be detached also. (IF) it ends up staying with the
>>>>> cgroup/sockops interface, it should depend on the existing static key in
>>>>> cgroup_bpf_enabled(CGROUP_SOCK_OPS) instead of adding another one.
>>>>
>>>> Are you suggesting that we need to remove the current static key? In
>>>> the previous thread, the reason why Willem came up with this idea is,
>>>> I think, to avoid affect the non-bpf timestamping feature.
>>>>
>>>>>
>>>>>>
>>>>>> return 0;
>>>>>> }
>>>>>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>>>>>> index f36eb9daa31a..d0f912f1ff7b 100644
>>>>>> --- a/net/core/skbuff.c
>>>>>> +++ b/net/core/skbuff.c
>>>>>> @@ -5540,6 +5540,29 @@ void skb_complete_tx_timestamp(struct sk_buff *skb,
>>>>>> }
>>>>>> EXPORT_SYMBOL_GPL(skb_complete_tx_timestamp);
>>>>>>
>>>>>> +static bool sk_tstamp_tx_flags(struct sock *sk, u32 tsflags, int tstype)
>>>>>
>>>>> sk is unused.
>>>>
>>>> Thanks for the careful check.
>>>>
>>>>>
>>>>>> +{
>>>>>> + u32 testflag;
>>>>>> +
>>>>>> + switch (tstype) {
>>>>>> + case SCM_TSTAMP_SCHED:
>>>>>
>>>>> Instead of doing this translation,
>>>>> is it easier to directly store the bpf prog desired ts"type" (i.e. the
>>>>> SCM_TSTAMP_*) in the sk->sk_tsflags_bpf?
>>>>> or there is a specific need to keep the SOF_TIMESTAMPING_* value in
>>>>> sk->sk_tsflags_bpf?
>>>>
>>>> We have to reuse SOF_TIMESTAMPING_* because there are more flags, say,
>>>> SOF_TIMESTAMPING_OPT_ID, that we need to support.
>>>>
>>>>>
>>>>>> + testflag = SOF_TIMESTAMPING_TX_SCHED;
>>>>>> + break;
>>>>>> + case SCM_TSTAMP_SND:
>>>>>> + testflag = SOF_TIMESTAMPING_TX_SOFTWARE;
>>>>>> + break;
>>>>>> + case SCM_TSTAMP_ACK:
>>>>>> + testflag = SOF_TIMESTAMPING_TX_ACK;
>>>>>> + break;
>>>>>> + default:
>>>>>> + return false;
>>>>>> + }
>>>>>> + if (tsflags & testflag)
>>>>>> + return true;
>>>>>> +
>>>>>> + return false;
>>>>>> +}
>>>>>> +
>>>>>> static void skb_tstamp_tx_output(struct sk_buff *orig_skb,
>>>>>> const struct sk_buff *ack_skb,
>>>>>> struct skb_shared_hwtstamps *hwtstamps,
>>>>>> @@ -5558,6 +5581,9 @@ static void skb_tstamp_tx_output(struct sk_buff *orig_skb,
>>>>>> if (!skb_may_tx_timestamp(sk, tsonly))
>>>>>> return;
>>>>>>
>>>>>> + if (!sk_tstamp_tx_flags(sk, tsflags, tstype))
>>>>>
>>>>> This is a new test. tsflags is the sk->sk_tsflags here if I read it correctly.
>>>>
>>>> This test will be used in bpf and non-bpf cases. Because of this, we
>>>> can support BPF extension. In this function, if skb has tsflags but we
>>>> don't know which approach the user expects, sk_tstamp_tx_flags() can
>>>> help us.
>>>>
>>>>>
>>>>> My understanding is the sendmsg can provide SOF_TIMESTAMPING_* for individual
>>>>> skb. Would it break?
>>>>
>>>> Oh, you're right. I didn't support cmsg mode...
>>>
>>> I think I only need to test if it's in the bpf mode, or else let the
>>> original way print the timestamp, which can solve the issue.
>>
>> From looking at the existing "__skb_tstamp_tx(skb, NULL, NULL, skb->sk,
>> SCM_TSTAMP_SCHED);":
>>
>> int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
>> {
>> /* ... */
>>
>> if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_SCHED_TSTAMP))
>> __skb_tstamp_tx(skb, NULL, NULL, skb->sk, SCM_TSTAMP_SCHED);
>>
>> /* ... */
>> }
>>
>> I am still puzzling how __skb_tstamp_tx() will be called if only bpf has enabled
>> the timestamping. I may have missed somewhere in the patch set that the skb's
>> tx_flags is changed by sk->sk_tsflags_bpf alone?
>
> If sk_tsflags_bpf is set, tcp_sendmsg() -> tcp_tx_timestamp() will be
> helpful, which initializes every last skb, please see patch [10/12].
Ah. ok. It is the thing I missed. Thanks for the pointer.
>>
>> I think a skb tskey is still desired (?), so eventually we want some spaces in
>
> tskey function is optional I think. It depends whether users want to
> use it or not. It can controlled by SOF_TIMESTAMPING_OPT_ID flag.
>
>> the skb for bpf. Jakub Sitnicki (cc-ed) has presented in LPC about extending
>> skb->data_meta usage outside of xdp and tc. I think here we want to have it
>> available at the tx side to store the tx_flags and tskey but probably want them
>> at a specific place/offset at the data_meta.
>
> If we have the plan to store extra information in data_meta, I can
> give it a try:)
>
>>
>> For now, is there thing we can explore to share in the skb_shared_info?
>
> My initial thought is just to reuse these fields in skb. It can work
> without interfering one another.
After reading closer to patch 10, I am likely still missing something. How can
it tell if the tx_flags is set by the bpf or by the user space cmsg?
>
>> Can the "struct skb_shared_hwtstamps hwtstamps;" be used for the bpf tx_flags and tskey
>> only at the "tx" side? There is already another union member.
>
> tskey is always used in the tx path.
>
> hwtstamps can be used in both rx and tx cases (please see
> tcp_update_recv_tstamps() and skb_tstamp_tx()).
hmm... we only need some where to store the bpf tx_flags and bpf tskey in the
TX-ing skb. You meant the hwtstamps of a Tx-ing skb is not empty?
At skb_tstamp_tx (TX side only?), the orig_skb's hwtstamps has not been written yet?
>
>> The hwtstamps should only be needed when the NIC is done sending?
>
> In this patch, yes, hwtstamps are the records in tx path.
>
> Thanks,
> Jason
next prev parent reply other threads:[~2024-10-16 7:01 UTC|newest]
Thread overview: 73+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-12 4:06 [PATCH net-next v2 00/12] net-timestamp: bpf extension to equip applications transparently Jason Xing
2024-10-12 4:06 ` [PATCH net-next v2 01/12] net-timestamp: introduce socket tsflag requestors Jason Xing
2024-10-15 1:30 ` Willem de Bruijn
2024-10-15 1:50 ` Jason Xing
2024-10-12 4:06 ` [PATCH net-next v2 02/12] net-timestamp: open gate for bpf_setsockopt Jason Xing
2024-10-15 1:34 ` Willem de Bruijn
2024-10-15 2:05 ` Jason Xing
2024-10-15 21:32 ` Martin KaFai Lau
2024-10-15 21:55 ` Willem de Bruijn
2024-10-22 13:22 ` Jason Xing
2024-10-23 0:06 ` Willem de Bruijn
2024-10-23 3:49 ` Jason Xing
2024-10-16 0:45 ` Jason Xing
2024-10-15 23:54 ` Martin KaFai Lau
2024-10-16 0:49 ` Jason Xing
2024-10-12 4:06 ` [PATCH net-next v2 03/12] net-timestamp: reorganize in skb_tstamp_tx_output() Jason Xing
2024-10-12 4:06 ` [PATCH net-next v2 04/12] net-timestamp: add static key to control the whole bpf extension Jason Xing
2024-10-15 1:36 ` Willem de Bruijn
2024-10-15 2:25 ` Jason Xing
2024-10-16 0:09 ` Martin KaFai Lau
2024-10-16 1:04 ` Jason Xing
2024-10-16 1:32 ` Jason Xing
2024-10-16 6:13 ` Martin KaFai Lau
2024-10-16 6:30 ` Jason Xing
2024-10-16 7:01 ` Martin KaFai Lau [this message]
2024-10-16 7:54 ` Jason Xing
2024-10-16 8:31 ` Martin KaFai Lau
2024-10-16 10:36 ` Jason Xing
2024-10-17 0:48 ` Martin KaFai Lau
2024-10-17 2:28 ` Jason Xing
2024-10-17 20:43 ` Martin KaFai Lau
2024-10-18 2:52 ` Jason Xing
2024-10-18 3:05 ` Jason Xing
2024-10-16 6:31 ` Martin KaFai Lau
2024-10-16 6:45 ` Jason Xing
2024-10-16 13:13 ` Willem de Bruijn
2024-10-16 13:22 ` Jason Xing
2024-10-20 21:51 ` Willem de Bruijn
2024-10-21 3:21 ` Jason Xing
2024-10-21 14:49 ` Willem de Bruijn
2024-10-21 15:05 ` Jason Xing
2024-10-22 0:53 ` Martin KaFai Lau
2024-10-22 2:30 ` Jason Xing
2024-10-23 0:17 ` Willem de Bruijn
2024-10-23 2:31 ` Willem de Bruijn
2024-10-12 4:06 ` [PATCH net-next v2 05/12] net-timestamp: add bpf infrastructure to allow exposing timestamp later Jason Xing
2024-10-12 4:06 ` [PATCH net-next v2 06/12] net-timestamp: introduce TS_SCHED_OPT_CB to generate dev xmit timestamp Jason Xing
2024-10-16 1:01 ` Martin KaFai Lau
2024-10-16 1:24 ` Jason Xing
2024-10-16 5:35 ` Martin KaFai Lau
2024-10-16 6:08 ` Jason Xing
2024-10-12 4:06 ` [PATCH net-next v2 07/12] net-timestamp: introduce TS_SW_OPT_CB to generate driver timestamp Jason Xing
2024-10-12 4:06 ` [PATCH net-next v2 08/12] net-timestamp: introduce TS_ACK_OPT_CB to generate tcp acked timestamp Jason Xing
2024-10-12 4:06 ` [PATCH net-next v2 09/12] net-timestamp: add tx OPT_ID_TCP support for bpf case Jason Xing
2024-10-15 1:38 ` Willem de Bruijn
2024-10-15 2:25 ` Jason Xing
2024-10-15 2:38 ` Willem de Bruijn
2024-10-15 2:59 ` Jason Xing
2024-10-15 8:40 ` kernel test robot
2024-10-15 9:36 ` Jason Xing
2024-10-12 4:06 ` [PATCH net-next v2 10/12] net-timestamp: make bpf for tx timestamp work Jason Xing
2024-10-12 4:06 ` [PATCH net-next v2 11/12] net-timestamp: add bpf framework for rx timestamps Jason Xing
2024-10-15 1:44 ` Willem de Bruijn
2024-10-15 2:18 ` Jason Xing
2024-10-12 4:06 ` [PATCH net-next v2 12/12] net-timestamp: add bpf support for rx software/hardware timestamp Jason Xing
2024-10-12 17:48 ` [PATCH net-next v2 00/12] net-timestamp: bpf extension to equip applications transparently Willem de Bruijn
2024-10-13 3:28 ` Jason Xing
2024-10-13 3:43 ` Jason Xing
2024-10-13 6:05 ` Jason Xing
2024-10-15 1:28 ` Willem de Bruijn
2024-10-15 2:52 ` Jason Xing
2024-10-15 2:59 ` Willem de Bruijn
2024-10-15 3:02 ` Jason Xing
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5398c020-e9b4-49d2-a5fa-dca047296ddd@linux.dev \
--to=martin.lau@linux.dev \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=dsahern@kernel.org \
--cc=eddyz87@gmail.com \
--cc=edumazet@google.com \
--cc=haoluo@google.com \
--cc=jakub@cloudflare.com \
--cc=john.fastabend@gmail.com \
--cc=jolsa@kernel.org \
--cc=kerneljasonxing@gmail.com \
--cc=kernelxing@tencent.com \
--cc=kpsingh@kernel.org \
--cc=kuba@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=sdf@fomichev.me \
--cc=song@kernel.org \
--cc=willemb@google.com \
--cc=willemdebruijn.kernel@gmail.com \
--cc=yonghong.song@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).