netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Martin KaFai Lau <martin.lau@linux.dev>
To: Jason Xing <kerneljasonxing@gmail.com>,
	Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	pabeni@redhat.com, dsahern@kernel.org, willemb@google.com,
	ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
	eddyz87@gmail.com, song@kernel.org, yonghong.song@linux.dev,
	john.fastabend@gmail.com, kpsingh@kernel.org, sdf@fomichev.me,
	haoluo@google.com, jolsa@kernel.org, horms@kernel.org,
	bpf@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH bpf-next v8 01/12] bpf: add support for bpf_setsockopt()
Date: Wed, 5 Feb 2025 12:57:58 -0800	[thread overview]
Message-ID: <0f3159c8-d2fe-4fc0-8f09-260aa84e12bd@linux.dev> (raw)
In-Reply-To: <CAL+tcoA26mQQf-_0Ebi2qqd=qVn1soUoma-3o8NdUFGZL_-L2g@mail.gmail.com>

On 2/5/25 7:34 AM, Jason Xing wrote:
> On Wed, Feb 5, 2025 at 11:22 PM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
>>
>> Jason Xing wrote:
>>> Users can write the following code to enable the bpf extension:
>>> int flags = SK_BPF_CB_TX_TIMESTAMPING;
>>> int opts = SK_BPF_CB_FLAGS;
>>> bpf_setsockopt(skops, SOL_SOCKET, opts, &flags, sizeof(flags));
>>>
>>> Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
>>> ---
>>>   include/net/sock.h             |  3 +++
>>>   include/uapi/linux/bpf.h       |  8 ++++++++
>>>   net/core/filter.c              | 23 +++++++++++++++++++++++
>>>   tools/include/uapi/linux/bpf.h |  1 +
>>>   4 files changed, 35 insertions(+)
>>>
>>> diff --git a/include/net/sock.h b/include/net/sock.h
>>> index 8036b3b79cd8..7916982343c6 100644
>>> --- a/include/net/sock.h
>>> +++ b/include/net/sock.h
>>> @@ -303,6 +303,7 @@ struct sk_filter;
>>>     *  @sk_stamp: time stamp of last packet received
>>>     *  @sk_stamp_seq: lock for accessing sk_stamp on 32 bit architectures only
>>>     *  @sk_tsflags: SO_TIMESTAMPING flags
>>> +  *  @sk_bpf_cb_flags: used in bpf_setsockopt()
>>>     *  @sk_use_task_frag: allow sk_page_frag() to use current->task_frag.
>>>     *                     Sockets that can be used under memory reclaim should
>>>     *                     set this to false.
>>> @@ -445,6 +446,8 @@ struct sock {
>>>        u32                     sk_reserved_mem;
>>>        int                     sk_forward_alloc;
>>>        u32                     sk_tsflags;
>>> +#define SK_BPF_CB_FLAG_TEST(SK, FLAG) ((SK)->sk_bpf_cb_flags & (FLAG))
>>> +     u32                     sk_bpf_cb_flags;
>>>        __cacheline_group_end(sock_write_rxtx);
>>>
>>>        __cacheline_group_begin(sock_write_tx);
>>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>>> index 2acf9b336371..6116eb3d1515 100644
>>> --- a/include/uapi/linux/bpf.h
>>> +++ b/include/uapi/linux/bpf.h
>>> @@ -6913,6 +6913,13 @@ enum {
>>>        BPF_SOCK_OPS_ALL_CB_FLAGS       = 0x7F,
>>>   };
>>>
>>> +/* Definitions for bpf_sk_cb_flags */
>>> +enum {
>>> +     SK_BPF_CB_TX_TIMESTAMPING       = 1<<0,
>>> +     SK_BPF_CB_MASK                  = (SK_BPF_CB_TX_TIMESTAMPING - 1) |
>>> +                                        SK_BPF_CB_TX_TIMESTAMPING
>>> +};
>>> +
>>>   /* List of known BPF sock_ops operators.
>>>    * New entries can only be added at the end
>>>    */
>>> @@ -7091,6 +7098,7 @@ enum {
>>>        TCP_BPF_SYN_IP          = 1006, /* Copy the IP[46] and TCP header */
>>>        TCP_BPF_SYN_MAC         = 1007, /* Copy the MAC, IP[46], and TCP header */
>>>        TCP_BPF_SOCK_OPS_CB_FLAGS = 1008, /* Get or Set TCP sock ops flags */
>>> +     SK_BPF_CB_FLAGS         = 1009, /* Used to set socket bpf flags */
>>>   };
>>>
>>>   enum {
>>> diff --git a/net/core/filter.c b/net/core/filter.c
>>> index 2ec162dd83c4..1c6c07507a78 100644
>>> --- a/net/core/filter.c
>>> +++ b/net/core/filter.c
>>> @@ -5222,6 +5222,25 @@ static const struct bpf_func_proto bpf_get_socket_uid_proto = {
>>>        .arg1_type      = ARG_PTR_TO_CTX,
>>>   };
>>>
>>> +static int sk_bpf_set_get_cb_flags(struct sock *sk, char *optval, bool getopt)
>>> +{
>>> +     u32 sk_bpf_cb_flags;
>>> +
>>> +     if (getopt) {
>>> +             *(u32 *)optval = sk->sk_bpf_cb_flags;
>>> +             return 0;
>>> +     }
>>> +
>>> +     sk_bpf_cb_flags = *(u32 *)optval;
>>> +
>>> +     if (sk_bpf_cb_flags & ~SK_BPF_CB_MASK)
>>> +             return -EINVAL;
>>> +
>>> +     sk->sk_bpf_cb_flags = sk_bpf_cb_flags;
>>
>> I don't know BPF internals that well:
>>
>> Is there mutual exclusion between these sol_socket_sockopt calls?

Yep. There is a sock_owned_by_me() in the earlier code path of sol_socket_sockopt.

Another reader is at patch 11 tcp_tx_timestamp() which should have owned the 
sock also.

>> Or do these sk field accesses need WRITE_ONCE/READ_ONCE.

> this potential data race issue, just in case bpf program doesn't use
> it as we expect, I think I will add the this annotation in v9.

Jason, it should not have data race issue in sol_socket_sockopt(). bpf program 
cannot use the sol_socket_sockopt without a lock. Patch 4 was added exactly to 
ensure that.

The situation is similar to the existing tcp_sk(sk)->bpf_sock_ops_cb_flags. It 
is also a plain access such that it is clear all read/write is under the 
sock_owned_by_me.

  reply	other threads:[~2025-02-05 20:58 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-04 18:30 [PATCH bpf-next v8 00/12] net-timestamp: bpf extension to equip applications transparently Jason Xing
2025-02-04 18:30 ` [PATCH bpf-next v8 01/12] bpf: add support for bpf_setsockopt() Jason Xing
2025-02-05 15:22   ` Willem de Bruijn
2025-02-05 15:34     ` Jason Xing
2025-02-05 20:57       ` Martin KaFai Lau [this message]
2025-02-05 21:25       ` Willem de Bruijn
2025-02-04 18:30 ` [PATCH bpf-next v8 02/12] bpf: prepare for timestamping callbacks use Jason Xing
2025-02-04 18:30 ` [PATCH bpf-next v8 03/12] bpf: stop unsafely accessing TCP fields in bpf callbacks Jason Xing
2025-02-05 15:24   ` Willem de Bruijn
2025-02-05 15:35     ` Jason Xing
2025-02-04 18:30 ` [PATCH bpf-next v8 04/12] bpf: stop calling some sock_op BPF CALLs in new timestamping callbacks Jason Xing
2025-02-05 15:26   ` Willem de Bruijn
2025-02-05 15:50     ` Jason Xing
2025-02-04 18:30 ` [PATCH bpf-next v8 05/12] net-timestamp: prepare for isolating two modes of SO_TIMESTAMPING Jason Xing
2025-02-05  1:47   ` Jakub Kicinski
2025-02-05  2:40     ` Jason Xing
2025-02-05  3:14       ` Jakub Kicinski
2025-02-05  3:23         ` Jason Xing
2025-02-05  1:50   ` Jakub Kicinski
2025-02-05 15:34   ` Willem de Bruijn
2025-02-05 15:52     ` Jason Xing
2025-02-06  8:43     ` Jason Xing
2025-02-06 10:22       ` Jason Xing
2025-02-06 16:13       ` Willem de Bruijn
2025-02-07  0:22         ` Jason Xing
2025-02-04 18:30 ` [PATCH bpf-next v8 06/12] bpf: support SCM_TSTAMP_SCHED " Jason Xing
2025-02-05 15:36   ` Willem de Bruijn
2025-02-05 15:55     ` Jason Xing
2025-02-04 18:30 ` [PATCH bpf-next v8 07/12] bpf: support sw SCM_TSTAMP_SND " Jason Xing
2025-02-04 18:30 ` [PATCH bpf-next v8 08/12] bpf: support hw " Jason Xing
2025-02-05 15:45   ` Willem de Bruijn
2025-02-05 16:03     ` Jason Xing
2025-02-10 22:39       ` Martin KaFai Lau
2025-02-11  0:00         ` Jason Xing
2025-02-04 18:30 ` [PATCH bpf-next v8 09/12] bpf: support SCM_TSTAMP_ACK " Jason Xing
2025-02-05 15:47   ` Willem de Bruijn
2025-02-05 16:06     ` Jason Xing
2025-02-05 21:25       ` Willem de Bruijn
2025-02-04 18:30 ` [PATCH bpf-next v8 10/12] bpf: make TCP tx timestamp bpf extension work Jason Xing
2025-02-05  1:57   ` Jakub Kicinski
2025-02-05  2:15     ` Jason Xing
2025-02-05 21:57     ` Martin KaFai Lau
2025-02-06  0:12       ` Jason Xing
2025-02-06  0:42         ` Jason Xing
2025-02-06  0:47         ` Martin KaFai Lau
2025-02-06  1:05           ` Jason Xing
2025-02-06  2:39             ` Jason Xing
2025-02-06  2:56               ` Willem de Bruijn
2025-02-06  3:09                 ` Jason Xing
2025-02-06  3:25                   ` Willem de Bruijn
2025-02-06  3:41                     ` Jason Xing
2025-02-06  6:12                       ` Martin KaFai Lau
2025-02-06  6:56                         ` Jason Xing
2025-02-07  2:07                           ` Martin KaFai Lau
2025-02-07  2:18                             ` Jason Xing
2025-02-07 12:07                               ` Jason Xing
2025-02-08  2:11                                 ` Martin KaFai Lau
2025-02-08  6:53                                   ` Jason Xing
2025-02-07 13:34                             ` Jason Xing
2025-02-04 18:30 ` [PATCH bpf-next v8 11/12] bpf: add a new callback in tcp_tx_timestamp() Jason Xing
2025-02-05  5:28   ` Jason Xing
2025-02-04 18:30 ` [PATCH bpf-next v8 12/12] selftests/bpf: add simple bpf tests in the tx path for timestamping feature Jason Xing
2025-02-05 15:54   ` Willem de Bruijn
2025-02-05 16:08     ` Jason Xing
2025-02-06  1:28       ` Martin KaFai Lau
2025-02-06  2:14         ` Jason Xing

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0f3159c8-d2fe-4fc0-8f09-260aa84e12bd@linux.dev \
    --to=martin.lau@linux.dev \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=dsahern@kernel.org \
    --cc=eddyz87@gmail.com \
    --cc=edumazet@google.com \
    --cc=haoluo@google.com \
    --cc=horms@kernel.org \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kerneljasonxing@gmail.com \
    --cc=kpsingh@kernel.org \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=sdf@fomichev.me \
    --cc=song@kernel.org \
    --cc=willemb@google.com \
    --cc=willemdebruijn.kernel@gmail.com \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).