netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Martin KaFai Lau <martin.lau@linux.dev>
To: Jason Xing <kerneljasonxing@gmail.com>
Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	pabeni@redhat.com, dsahern@kernel.org,
	willemdebruijn.kernel@gmail.com, willemb@google.com,
	ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
	eddyz87@gmail.com, song@kernel.org, yonghong.song@linux.dev,
	john.fastabend@gmail.com, kpsingh@kernel.org, sdf@fomichev.me,
	haoluo@google.com, jolsa@kernel.org, horms@kernel.org,
	bpf@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH net-next v5 03/15] bpf: introduce timestamp_used to allow UDP socket fetched in bpf prog
Date: Tue, 14 Jan 2025 17:17:20 -0800	[thread overview]
Message-ID: <02031003-872e-49bf-a658-c22bc7e1a954@linux.dev> (raw)
In-Reply-To: <20250112113748.73504-4-kerneljasonxing@gmail.com>

On 1/12/25 3:37 AM, Jason Xing wrote:
> timestamp_used consists of two parts, one is is_fullsock, the other
> one is for UDP socket which will be support in the next round.
> 
> Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
> ---
>   include/linux/filter.h | 1 +
>   net/core/filter.c      | 4 ++--
>   net/core/sock.c        | 1 +
>   net/ipv4/tcp_input.c   | 2 ++
>   net/ipv4/tcp_output.c  | 2 ++
>   5 files changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/filter.h b/include/linux/filter.h
> index a3ea46281595..daca3fe48b8f 100644
> --- a/include/linux/filter.h
> +++ b/include/linux/filter.h
> @@ -1508,6 +1508,7 @@ struct bpf_sock_ops_kern {
>   	void	*skb_data_end;
>   	u8	op;
>   	u8	is_fullsock;
> +	u8	timestamp_used;
>   	u8	remaining_opt_len;
>   	u64	temp;			/* temp and everything after is not
>   					 * initialized to 0 before calling
> diff --git a/net/core/filter.c b/net/core/filter.c
> index c6dd2d2e44c8..1ac996ec5e0f 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -10424,10 +10424,10 @@ static u32 sock_ops_convert_ctx_access(enum bpf_access_type type,
>   		}							      \
>   		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(			      \
>   						struct bpf_sock_ops_kern,     \
> -						is_fullsock),		      \
> +						timestamp_used),	      \
>   				      fullsock_reg, si->src_reg,	      \
>   				      offsetof(struct bpf_sock_ops_kern,      \
> -					       is_fullsock));		      \
> +					       timestamp_used));	      \

hmm... I don't think it is the right change. This change may disallow the bpf 
prog from reading skops->sk. It is fine to allow bpf prog (includes the new 
timestamp callback) getting the skops->sk as long as skops->sk is a fullsock.

The actual thing that needs to address is writing to sk, like:

	case offsetof(struct bpf_sock_ops, sk_txhash):
		SOCK_OPS_GET_OR_SET_FIELD(sk_txhash, sk_txhash,
                                           struct sock, type);


and also all the SOCK_OPS_GET_TCP_SOCK_FIELD() to prepare for the udp sock 
support. After this patch 3, I think I start to understand the udp/fullsock 
discussion in patch 2. is_fullsock here does not mean it is tcp, although it is 
always a tcp_sock now. It literally means it is a full "struct sock". The 
verifier will treat the skops->sk as "struct sock" instead of "struct tcp_sock".

>   		*insn++ = BPF_JMP_IMM(BPF_JEQ, fullsock_reg, 0, jmp);	      \
>   		if (si->dst_reg == si->src_reg)				      \
>   			*insn++ = BPF_LDX_MEM(BPF_DW, reg, si->src_reg,	      \
> diff --git a/net/core/sock.c b/net/core/sock.c
> index e06bcafb1b2d..dbb9326ae9d1 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -958,6 +958,7 @@ void bpf_skops_tx_timestamping(struct sock *sk, struct sk_buff *skb, int op)
>   	if (sk_is_tcp(sk) && sk_fullsock(sk))
>   		sock_ops.is_fullsock = 1;
>   	sock_ops.sk = sk;
> +	sock_ops.timestamp_used = 1;
>   	__cgroup_bpf_run_filter_sock_ops(sk, &sock_ops, CGROUP_SOCK_OPS);
>   }
>   #endif
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 4811727b8a02..cad41ad34bd5 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -169,6 +169,7 @@ static void bpf_skops_parse_hdr(struct sock *sk, struct sk_buff *skb)
>   	memset(&sock_ops, 0, offsetof(struct bpf_sock_ops_kern, temp));
>   	sock_ops.op = BPF_SOCK_OPS_PARSE_HDR_OPT_CB;
>   	sock_ops.is_fullsock = 1;
> +	sock_ops.timestamp_used = 1;
>   	sock_ops.sk = sk;
>   	bpf_skops_init_skb(&sock_ops, skb, tcp_hdrlen(skb));
>   
> @@ -185,6 +186,7 @@ static void bpf_skops_established(struct sock *sk, int bpf_op,
>   	memset(&sock_ops, 0, offsetof(struct bpf_sock_ops_kern, temp));
>   	sock_ops.op = bpf_op;
>   	sock_ops.is_fullsock = 1;
> +	sock_ops.timestamp_used = 1;
>   	sock_ops.sk = sk;
>   	/* sk with TCP_REPAIR_ON does not have skb in tcp_finish_connect */
>   	if (skb)
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 0e5b9a654254..7b4d1dfd57d4 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -522,6 +522,7 @@ static void bpf_skops_hdr_opt_len(struct sock *sk, struct sk_buff *skb,
>   		sock_owned_by_me(sk);
>   
>   		sock_ops.is_fullsock = 1;
> +		sock_ops.timestamp_used = 1;
>   		sock_ops.sk = sk;
>   	}
>   
> @@ -567,6 +568,7 @@ static void bpf_skops_write_hdr_opt(struct sock *sk, struct sk_buff *skb,
>   		sock_owned_by_me(sk);
>   
>   		sock_ops.is_fullsock = 1;
> +		sock_ops.timestamp_used = 1;

The "timestamp_used = 1;' assignment has missed some places. At least in the 
tcp_call_bpf().

Also, the name "timestamp_used" is confusing. Like setting timestamp_used in the 
bpf_skops_*_hdr_opt() callback here when it is not a timestamp callback.

Altogether, need to rethink what to add to sock_ops instead of timestamp_used 
and it should be checked in "some" of the SOCK_OPS_*_FIELD(). A quick thought 
(not 100% sure) is to add "u8 allow_direct_access" which is only set for the 
existing sockops callbacks.

[ I will continue the rest later. ]

>   		sock_ops.sk = sk;
>   	}
>   


  reply	other threads:[~2025-01-15  1:17 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-12 11:37 [PATCH net-next v5 00/15] net-timestamp: bpf extension to equip applications transparently Jason Xing
2025-01-12 11:37 ` [PATCH net-next v5 01/15] net-timestamp: add support for bpf_setsockopt() Jason Xing
2025-01-12 14:49   ` kernel test robot
2025-01-13  0:11     ` Jason Xing
2025-01-13  7:32       ` Jason Xing
2025-01-14 23:20   ` Martin KaFai Lau
2025-01-14 23:29     ` Jason Xing
2025-01-12 11:37 ` [PATCH net-next v5 02/15] net-timestamp: prepare for bpf prog use Jason Xing
2025-01-14 23:39   ` Martin KaFai Lau
2025-01-15  0:09     ` Jason Xing
2025-01-15  0:15       ` Jason Xing
2025-01-15  0:26         ` Martin KaFai Lau
2025-01-15  0:37           ` Jason Xing
2025-01-15  0:43             ` Jason Xing
2025-01-12 11:37 ` [PATCH net-next v5 03/15] bpf: introduce timestamp_used to allow UDP socket fetched in bpf prog Jason Xing
2025-01-15  1:17   ` Martin KaFai Lau [this message]
2025-01-15  2:28     ` Jason Xing
2025-01-15  2:54       ` Jason Xing
2025-01-16  0:51         ` Martin KaFai Lau
2025-01-16  1:12           ` Jason Xing
2025-01-18  1:42             ` Martin KaFai Lau
2025-01-18  1:58               ` Jason Xing
2025-01-18  2:16                 ` Martin KaFai Lau
2025-01-18  2:37                   ` Jason Xing
2025-01-12 11:37 ` [PATCH net-next v5 04/15] net-timestamp: support SK_BPF_CB_FLAGS only in bpf_sock_ops_setsockopt Jason Xing
2025-01-15 21:22   ` Martin KaFai Lau
2025-01-15 23:26     ` Jason Xing
2025-01-12 11:37 ` [PATCH net-next v5 05/15] net-timestamp: add strict check in some BPF calls Jason Xing
2025-01-12 14:37   ` kernel test robot
2025-01-13  0:28     ` Jason Xing
2025-01-15 21:48   ` Martin KaFai Lau
2025-01-15 23:32     ` Jason Xing
2025-01-18  2:15       ` Martin KaFai Lau
2025-01-18  6:28         ` Jason Xing
2025-01-17 10:18   ` kernel test robot
2025-01-12 11:37 ` [PATCH net-next v5 06/15] net-timestamp: prepare for isolating two modes of SO_TIMESTAMPING Jason Xing
2025-01-15 22:11   ` Martin KaFai Lau
2025-01-15 23:50     ` Jason Xing
2025-01-12 11:37 ` [PATCH net-next v5 07/15] net-timestamp: support SCM_TSTAMP_SCHED for bpf extension Jason Xing
2025-01-15 22:32   ` Martin KaFai Lau
2025-01-15 23:57     ` Jason Xing
2025-01-12 11:37 ` [PATCH net-next v5 08/15] net-timestamp: support sw SCM_TSTAMP_SND " Jason Xing
2025-01-15 22:48   ` Martin KaFai Lau
2025-01-15 23:56     ` Jason Xing
2025-01-18  0:46       ` Martin KaFai Lau
2025-01-18  1:43         ` Jason Xing
2025-01-19 13:38           ` Jason Xing
2025-01-12 11:37 ` [PATCH net-next v5 09/15] net-timestamp: support SCM_TSTAMP_ACK " Jason Xing
2025-01-15 23:02   ` Martin KaFai Lau
2025-01-12 11:37 ` [PATCH net-next v5 10/15] net-timestamp: support hw SCM_TSTAMP_SND " Jason Xing
2025-01-12 11:37 ` [PATCH net-next v5 11/15] net-timestamp: support export skb to the userspace Jason Xing
2025-01-15 23:05   ` Martin KaFai Lau
2025-01-15 23:59     ` Jason Xing
2025-01-12 11:37 ` [PATCH net-next v5 12/15] net-timestamp: make TCP tx timestamp bpf extension work Jason Xing
2025-01-12 11:37 ` [PATCH net-next v5 13/15] net-timestamp: support tcp_sendmsg for bpf extension Jason Xing
2025-01-16  0:03   ` Martin KaFai Lau
2025-01-16  0:41     ` Jason Xing
2025-01-16  1:18       ` Martin KaFai Lau
2025-01-16  1:22         ` Jason Xing
2025-01-12 11:37 ` [PATCH net-next v5 14/15] net-timestamp: introduce cgroup lock to avoid affecting non-bpf cases Jason Xing
2025-01-12 11:37 ` [PATCH net-next v5 15/15] bpf: add simple bpf tests in the tx path for so_timestamping feature Jason Xing

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=02031003-872e-49bf-a658-c22bc7e1a954@linux.dev \
    --to=martin.lau@linux.dev \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=dsahern@kernel.org \
    --cc=eddyz87@gmail.com \
    --cc=edumazet@google.com \
    --cc=haoluo@google.com \
    --cc=horms@kernel.org \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kerneljasonxing@gmail.com \
    --cc=kpsingh@kernel.org \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=sdf@fomichev.me \
    --cc=song@kernel.org \
    --cc=willemb@google.com \
    --cc=willemdebruijn.kernel@gmail.com \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).