From: Jakub Kicinski <kuba@kernel.org>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Daniel Rosenberg <drosen@google.com>, bpf <bpf@vger.kernel.org>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
John Fastabend <john.fastabend@gmail.com>,
Andrii Nakryiko <andrii@kernel.org>,
Martin KaFai Lau <martin.lau@linux.dev>,
Song Liu <song@kernel.org>, Yonghong Song <yhs@fb.com>,
KP Singh <kpsingh@kernel.org>,
Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>,
Jiri Olsa <jolsa@kernel.org>, Shuah Khan <shuah@kernel.org>,
Jonathan Corbet <corbet@lwn.net>,
Joanne Koong <joannelkoong@gmail.com>,
Mykola Lysenko <mykolal@fb.com>,
LKML <linux-kernel@vger.kernel.org>,
"open list:KERNEL SELFTEST FRAMEWORK"
<linux-kselftest@vger.kernel.org>,
Android Kernel Team <kernel-team@android.com>
Subject: Re: [PATCH v2 1/3] bpf: Allow NULL buffers in bpf_dynptr_slice(_rw)
Date: Tue, 18 Jul 2023 11:11:01 -0700
Message-ID: <20230718111101.57b1d411@kernel.org>
In-Reply-To: <CAADnVQ+jAo4V-Pa9_LhJEwG0QquL-Ld5S99v3LNUtgkiiYwfzw@mail.gmail.com>
On Tue, 18 Jul 2023 10:50:14 -0700 Alexei Starovoitov wrote:
> On Tue, Jul 18, 2023 at 10:18 AM Jakub Kicinski <kuba@kernel.org> wrote:
> > > you're still missing the point. Pls read the whole patch series.
> >
> > Could you just tell me what the point is then? The "series" is one
> > patch plus some tiny selftests. I don't see any documentation for
> > how dynptrs are supposed to work either.
> >
> > As far as I can grasp this makes the "copy buffer" optional from
> > the kfunc-API perspective (of bpf_dynptr_slice()).
> >
> > > It is _not_ input validation.
> > > skb_copy_bits is a slow path. One extra check doesn't affect
> > > performance at all. So 'fast paths' isn't a valid argument here.
> > > The code is reusing
> > > if (likely(hlen - offset >= len))
> > > return (void *)data + offset;
> > > which _is_ the fast path.
> > >
> > > What you're requesting is to copy paste
> > > the whole __skb_header_pointer into __skb_header_pointer2.
> > > Makes no sense.
> >
> > No, Alexei, the whole point of skb_header_pointer() is to pass
> > the secondary buffer, to make header parsing dependable.
>
> of course. No one argues about that.
>
> > Passing NULL buffer to skb_header_pointer() is absolutely nonsensical.
>
> Quick grep through the code proves you wrong:
> drivers/net/ethernet/broadcom/bnxt/bnxt.c
> __skb_header_pointer(NULL, start, sizeof(*hp), skb->data,
> skb_headlen(skb), NULL);
>
> was done before this patch. It's using __ variant on purpose
> and explicitly passing skb==NULL to exactly trigger that line
> to deliberately avoid the slow path.
>
> Another example:
> drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> skb_header_pointer(skb, 0, 0, NULL);
>
> This one I'm not sure about. Looks buggy.
These are both on the Tx path, setting up offloads; Linux doesn't request
offloads for headers outside of the linear part. The ixgbevf code is
completely pointless, as you say.
In general drivers are rarely a source of high quality code examples.
Having been directly involved in the bugs that led to the bnxt code
being written - I was so happy that the driver started parsing Tx
packets *at all*, so I wasn't too fussed by the minor problems :(
> > It should *not* be supported. We had enough prod problems with people
> > thinking that the entire header will be in the linear portion.
> > Then either the NIC can't parse the header, someone enables jumbo,
> > disables GRO, adds new HW, adds encap, etc etc and things implode.
>
> I don't see how this is related.
> NULL buffer allows to get a linear pointer and explicitly avoids
> slow path when it's not linear.
Direct packet access via skb->data is there for those who want high
speed 🤷️
> > If you want to support it in BPF that's up to you, but I think it's
> > entirely reasonable for me to request that you don't do such things
> > in general networking code. The function is 5 LoC, so a local BPF
> > copy seems fine. Although I'd suggest skb_header_pointer_misguided()
> > rather than __skb_header_pointer2() as the name :)
>
> If you insist we can, but bnxt is an example that buffer==NULL is
> a useful concept for networking and not bpf specific.
> It also doesn't make "people think the header is linear" any worse.
My worry is that people will think that whether the buffer is needed or
not depends on _their program_, rather than on the underlying platform.
So if it works in testing without the buffer, they will conclude that
the buffer must not be required for their use case.
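
To make that concern concrete, here is a minimal userspace sketch of the
contract being argued about. It is not the kernel's __skb_header_pointer();
struct pkt_model, model_header_pointer() and can_read_header() are
hypothetical names invented for illustration, and the packet is modelled as
one linear part plus a single fragment. The only assumption carried over
from the thread is the behaviour itself: a direct pointer when the range is
linear, a copy into the caller's buffer otherwise, and failure when no
buffer was supplied.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a packet: a linear head plus one fragment. */
struct pkt_model {
	const unsigned char *linear;	/* directly addressable bytes */
	size_t linear_len;
	const unsigned char *frag;	/* bytes that must be copied to be contiguous */
	size_t frag_len;
};

/*
 * Model of the header-pointer contract (an assumption, not kernel code):
 *  - fast path: the range lies in the linear part, return a direct pointer;
 *  - slow path: copy the range into the caller's buffer, if one was given;
 *  - NULL buffer: there is no slow path, so a non-linear range fails.
 */
static const void *model_header_pointer(const struct pkt_model *p,
					size_t offset, size_t len,
					void *buffer)
{
	size_t total = p->linear_len + p->frag_len;
	unsigned char *out = buffer;
	size_t from_linear;

	if (offset <= p->linear_len && p->linear_len - offset >= len)
		return p->linear + offset;	/* fast path */

	if (!buffer || offset > total || total - offset < len)
		return NULL;

	/* Slow path: stitch the linear tail and the fragment together. */
	from_linear = offset < p->linear_len ? p->linear_len - offset : 0;
	if (from_linear)
		memcpy(out, p->linear + offset, from_linear);
	memcpy(out + from_linear,
	       p->frag + (offset + from_linear - p->linear_len),
	       len - from_linear);
	return out;
}

/*
 * Returns 1 if an 8-byte header at offset 0 can be read when only the
 * first linear_len bytes (at most 8) are linear, with or without a buffer.
 */
static int can_read_header(size_t linear_len, int with_buffer)
{
	static const unsigned char bytes[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
	unsigned char buf[sizeof(bytes)];
	struct pkt_model p = {
		bytes, linear_len,
		bytes + linear_len, sizeof(bytes) - linear_len,
	};
	const unsigned char *h;

	h = model_header_pointer(&p, 0, sizeof(bytes),
				 with_buffer ? buf : NULL);
	return h && memcmp(h, bytes, sizeof(bytes)) == 0;
}
```

In this model can_read_header(8, 0) succeeds, which is the "works in
testing" case, while can_read_header(4, 0) fails although the caller's code
is identical. Only the layout changed, and the layout is decided by the
platform (NIC, GRO, MTU, encap), not by the program. Passing a buffer, as
in can_read_header(4, 1), succeeds for both layouts.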
Thread overview: 18+ messages
2023-05-02 0:52 [PATCH v2 1/3] bpf: Allow NULL buffers in bpf_dynptr_slice(_rw) Daniel Rosenberg
2023-05-02 0:52 ` [PATCH v2 2/3] selftests/bpf: Test allowing NULL buffer in dynptr slice Daniel Rosenberg
2023-05-03 16:20 ` Alexei Starovoitov
2023-05-02 0:52 ` [PATCH v2 3/3] selftests/bpf: Check overflow in optional buffer Daniel Rosenberg
2023-05-03 18:49 ` [PATCH v2 1/3] bpf: Allow NULL buffers in bpf_dynptr_slice(_rw) Andrii Nakryiko
2023-07-18 15:26 ` Jakub Kicinski
2023-07-18 15:52 ` Alexei Starovoitov
2023-07-18 16:06 ` Jakub Kicinski
2023-07-18 16:52 ` Alexei Starovoitov
2023-07-18 17:18 ` Jakub Kicinski
2023-07-18 17:50 ` Alexei Starovoitov
2023-07-18 18:11 ` Jakub Kicinski [this message]
2023-07-18 20:34 ` Alexei Starovoitov
2023-07-18 23:06 ` Jakub Kicinski
2023-07-18 23:17 ` Alexei Starovoitov
2023-07-18 23:21 ` Jakub Kicinski
2023-07-18 23:22 ` Alexei Starovoitov
2023-07-19 14:51 ` Daniel Borkmann