netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Martin KaFai Lau <martin.lau@linux.dev>
To: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: andrii@kernel.org, ast@kernel.org, bpf@vger.kernel.org,
	daniel@iogearbox.net, dxu@dxuuu.xyz, edumazet@google.com,
	kuni1840@gmail.com, netdev@vger.kernel.org,
	Yonghong Song <yonghong.song@linux.dev>
Subject: Re: [PATCH v5 bpf-next 6/6] selftest: bpf: Test bpf_sk_assign_tcp_reqsk().
Date: Wed, 13 Dec 2023 22:46:11 -0800	[thread overview]
Message-ID: <3186bf18-a8fd-4b30-a080-61beb13f19f7@linux.dev> (raw)
In-Reply-To: <20231214031819.83105-1-kuniyu@amazon.com>

On 12/13/23 7:18 PM, Kuniyuki Iwashima wrote:
>>> +static int tcp_parse_option(__u32 index, struct tcp_syncookie *ctx)
>>> +{
>>> +	struct tcp_options_received *tcp_opt = &ctx->attr.tcp_opt;
>>> +	char opcode, opsize;
>>> +
>>> +	if (ctx->ptr + 1 > ctx->data_end)
>>> +		goto stop;
>>> +
>>> +	opcode = *ctx->ptr++;
>>> +
>>> +	if (opcode == TCPOPT_EOL)
>>> +		goto stop;
>>> +
>>> +	if (opcode == TCPOPT_NOP)
>>> +		goto next;
>>> +
>>> +	if (ctx->ptr + 1 > ctx->data_end)
>>> +		goto stop;
>>> +
>>> +	opsize = *ctx->ptr++;
>>> +
>>> +	if (opsize < 2)
>>> +		goto stop;
>>> +
>>> +	switch (opcode) {
>>> +	case TCPOPT_MSS:
>>> +		if (opsize == TCPOLEN_MSS && ctx->tcp->syn &&
>>> +		    ctx->ptr + (TCPOLEN_MSS - 2) < ctx->data_end)
>>> +			tcp_opt->mss_clamp = get_unaligned_be16(ctx->ptr);
>>> +		break;
>>> +	case TCPOPT_WINDOW:
>>> +		if (opsize == TCPOLEN_WINDOW && ctx->tcp->syn &&
>>> +		    ctx->ptr + (TCPOLEN_WINDOW - 2) < ctx->data_end) {
>>> +			tcp_opt->wscale_ok = 1;
>>> +			tcp_opt->snd_wscale = *ctx->ptr;
>> When writing to a bitfield of "struct tcp_options_received" which is a kernel
>> struct, it needs to use the CO-RE api. The BPF_CORE_WRITE_BITFIELD has not been
>> landed yet:
>> https://lore.kernel.org/bpf/4d3dd215a4fd57d980733886f9c11a45e1a9adf3.1702325874.git.dxu@dxuuu.xyz/
>>
>> The same for reading bitfield but BPF_CORE_READ_BITFIELD() has already been
>> implemented in bpf_core_read.h
>>
>> Once the BPF_CORE_WRITE_BITFIELD is landed, this test needs to be changed to use
>> the BPF_CORE_{READ,WRITE}_BITFIELD.
> IIUC, the CO-RE api assumes that the offset of bitfields could be changed.
> 
> If the size of struct tcp_cookie_attributes is changed, kfunc will not work
> in this test.  So, BPF_CORE_WRITE_BITFIELD() works only when the size of
> tcp_cookie_attributes is unchanged but fields in tcp_options_received are
> rearranged or expanded to use the unused@ bits ?

Right, CO-RE helps to figure out the offset of a member in the running kernel.

> 
> Also, do we need to use BPF_CORE_READ() for other non-bitfields in
> strcut tcp_options_received (and ecn_ok in struct tcp_cookie_attributes
> just in case other fields are added to tcp_cookie_attributes and ecn_ok
> is rearranged) ?

BPF_CORE_READ is a CO-RE friendly macro for using bpf_probe_read_kernel(). 
bpf_probe_read_kernel() is mostly for the tracing use case where the ptr is not 
safe to read directly.

It is not the case for the tcp_options_received ptr in this tc-bpf use case or 
other stack allocated objects. In general, no need to use BPF_CORE_READ. The 
relocation will be done by the libbpf for tcp_opt->mss_clamp (e.g.).

Going back to bitfield, it needs BPF_CORE_*_BITFIELD because the offset may not 
be right after __attribute__((preserve_access_index)), cc: Yonghong and Andrii 
who know more details than I do.

A verifier error has been reported: 
https://lore.kernel.org/bpf/391d524c496acc97a8801d8bea80976f58485810.1700676682.git.dxu@dxuuu.xyz/.

I also hit an error earlier in 
https://lore.kernel.org/all/20220817061847.4182339-1-kafai@fb.com/ when not 
using BPF_CORE_READ_BITFIELD. I don't exactly remember how the instruction looks 
like but it was reading a wrong value instead of verifier error.

================

Going back to this patch set here.

After sleeping on it longer, I am thinking it is better not to reuse 'struct 
tcp_options_received' (meaning no bitfield) in the bpf_sk_assign_tcp_reqsk() 
kfunc API.

There is not much benefit in reusing 'tcp_options_received'. When new tcp option 
was ever added to tcp_options_received, it is not like bpf_sk_assign_tcp_reqsk 
will support it automatically. It needs to relay this new option back to the 
allocated req. Unlike tcp_sock or req which may have a lot of them such that it 
is useful to have a compact tcp_options_received, the tc-bpf use case here is to 
allocate it once in the stack. Also, not all the members in tcp_options_received 
is useful, e.g. num_sacks, ts_recent_stamp, and user_mss are not used. Leaving 
it there being ignored by bpf_sk_assign_tcp_reqsk is confusing.

How about using a full u8 for each necessary member and directly add them to 
struct tcp_cookie_attributes instead of nesting them into another struct. After 
taking out the unnecessary members, the size may not end up to be much bigger.

The bpf prog can then directly access attr->tstamp_ok more naturally. The 
changes to patch 5 and 6 should be mostly mechanical changes.

I would also rename s/tcp_cookie_attributes/bpf_tcp_req_attrs/.

wdyt?



  reply	other threads:[~2023-12-14  6:46 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-11  7:36 [PATCH v5 bpf-next 0/6] bpf: tcp: Support arbitrary SYN Cookie at TC Kuniyuki Iwashima
2023-12-11  7:36 ` [PATCH v5 bpf-next 1/6] tcp: Move tcp_ns_to_ts() to tcp.h Kuniyuki Iwashima
2023-12-11  7:36 ` [PATCH v5 bpf-next 2/6] tcp: Move skb_steal_sock() to request_sock.h Kuniyuki Iwashima
2023-12-11  7:36 ` [PATCH v5 bpf-next 3/6] bpf: tcp: Handle BPF SYN Cookie in skb_steal_sock() Kuniyuki Iwashima
2023-12-11  7:36 ` [PATCH v5 bpf-next 4/6] bpf: tcp: Handle BPF SYN Cookie in cookie_v[46]_check() Kuniyuki Iwashima
2023-12-11  7:36 ` [PATCH v5 bpf-next 5/6] bpf: tcp: Support arbitrary SYN Cookie Kuniyuki Iwashima
2023-12-11  7:36 ` [PATCH v5 bpf-next 6/6] selftest: bpf: Test bpf_sk_assign_tcp_reqsk() Kuniyuki Iwashima
2023-12-13 20:44   ` Martin KaFai Lau
2023-12-14  3:18     ` Kuniyuki Iwashima
2023-12-14  6:46       ` Martin KaFai Lau [this message]
2023-12-14  7:49         ` Kuniyuki Iwashima
2023-12-14 12:26           ` Kuniyuki Iwashima

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3186bf18-a8fd-4b30-a080-61beb13f19f7@linux.dev \
    --to=martin.lau@linux.dev \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=dxu@dxuuu.xyz \
    --cc=edumazet@google.com \
    --cc=kuni1840@gmail.com \
    --cc=kuniyu@amazon.com \
    --cc=netdev@vger.kernel.org \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).