public inbox for bpf@vger.kernel.org
 help / color / mirror / Atom feed
From: Paul Chaignon <paul.chaignon@gmail.com>
To: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Eduard Zingerman <eddyz87@gmail.com>,
	Amery Hung <ameryhung@gmail.com>,
	bpf@vger.kernel.org
Subject: Re: [PATCH bpf-next v5 3/5] bpf: Craft non-linear skbs in BPF_PROG_TEST_RUN
Date: Mon, 6 Oct 2025 22:50:21 +0200	[thread overview]
Message-ID: <aOQrjeaErvdq--wT@mail.gmail.com> (raw)
In-Reply-To: <82c4c97e-83cc-4f42-b3a0-11f46a7495d6@linux.dev>

On Mon, Oct 06, 2025 at 11:58:41AM -0700, Martin KaFai Lau wrote:
> On 10/6/25 7:04 AM, Paul Chaignon wrote:
> > On Thu, Oct 02, 2025 at 11:27:52AM -0700, Martin KaFai Lau wrote:
> > > On 10/2/25 3:07 AM, Paul Chaignon wrote:
> > > > This patch adds support for crafting non-linear skbs in BPF test runs
> > > > for tc programs. The size of the linear area is given by ctx->data_end,
> > > > with a minimum of ETH_HLEN always pulled in the linear area. If ctx or
> > > > ctx->data_end are null, a linear skb is used.
> > > > 
> > > > This is particularly useful to test support for non-linear skbs in large
> > > > codebases such as Cilium. We've had multiple bugs in the past few years
> > > > where we were missing calls to bpf_skb_pull_data(). This support in
> > > > BPF_PROG_TEST_RUN would allow us to automatically cover this case in our
> > > > BPF tests.
> > > > 
> > > > In addition to the selftests introduced later in the series, this patch
> > > > was tested by setting enabling non-linear skbs for all tc selftests
> > > > programs and checking test failures were expected.
> > > > 
> > > > Tested-by: syzbot@syzkaller.appspotmail.com
> > > > Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
> > > > Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
> > > > ---
> > > >    net/bpf/test_run.c | 67 +++++++++++++++++++++++++++++++++++++++++-----
> > > >    1 file changed, 61 insertions(+), 6 deletions(-)
> > > > 
> > > > diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> > > > index 3425100b1e8c..e4f4b423646a 100644
> > > > --- a/net/bpf/test_run.c
> > > > +++ b/net/bpf/test_run.c
> > > > @@ -910,6 +910,12 @@ static int convert___skb_to_skb(struct sk_buff *skb, struct __sk_buff *__skb)
> > > >    	/* cb is allowed */
> > > >    	if (!range_is_zero(__skb, offsetofend(struct __sk_buff, cb),
> > > > +			   offsetof(struct __sk_buff, data_end)))
> > > > +		return -EINVAL;
> > > > +
> > > > +	/* data_end is allowed, but not copied to skb */
> > > > +
> > > > +	if (!range_is_zero(__skb, offsetofend(struct __sk_buff, data_end),
> > > >    			   offsetof(struct __sk_buff, tstamp)))
> > > >    		return -EINVAL;
> > > > @@ -984,9 +990,12 @@ static struct proto bpf_dummy_proto = {
> > > >    int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
> > > >    			  union bpf_attr __user *uattr)
> > > >    {
> > > > +	u32 tailroom = SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> > > >    	bool is_l2 = false, is_direct_pkt_access = false;
> > > >    	struct net *net = current->nsproxy->net_ns;
> > > >    	struct net_device *dev = net->loopback_dev;
> > > > +	u32 headroom = NET_SKB_PAD + NET_IP_ALIGN;
> > > > +	u32 linear_sz = kattr->test.data_size_in;
> > > >    	u32 size = kattr->test.data_size_in;
> > > >    	u32 repeat = kattr->test.repeat;
> > > >    	struct __sk_buff *ctx = NULL;
> > > > @@ -1023,9 +1032,16 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
> > > >    	if (IS_ERR(ctx))
> > > >    		return PTR_ERR(ctx);
> > > > -	data = bpf_test_init(kattr, kattr->test.data_size_in,
> > > > -			     size, NET_SKB_PAD + NET_IP_ALIGN,
> > > > -			     SKB_DATA_ALIGN(sizeof(struct skb_shared_info)));
> > > > +	if (ctx) {
> > > > +		if (!is_l2 || ctx->data_end > kattr->test.data_size_in) {
> > > 
> > > What is the need for the "!is_l2" test?
> > 
> > There's nothing limiting us to only tc program types, but I was
> > wondering if it makes sense to open this (non-linear skbs) to all
> > program types. For example, cgroup_skb programs cannot call
> > bpf_skb_pull_data to deal with non-linear skbs.
> 
> The bpf_dynptr_* and the bpf_load_* can still be used to handle the a
> non-linear skb.
> 
> One thing is all "!is_l2" program will get an -EINVAL now after this patch
> if either ctx_in or ctx_out is specified. Am I reading it right?

Yes, that needs to be fixed either way.

> 
> > 
> > Even the LWT program types would require special care because ex. the
> > bpf_clone_redirect helper can end up calling eth_type_trans which
> > assumes we have at least ETH_HLEN in the linear area. I wasn't sure it
> > was worth opening this capability to these program types without a clear
> > use case.
> 
> I thought the linear_sz has been taken care of below such that it must be at
> least ETH_HLEN anyway. Why "!is_l2" still needs to be rejected?

In case of !is_l2, we will still call eth_type_trans() before running
the BPF program. That pulls the L2 header via eth_skb_pull_mac(). Then,
if we run an LWT program calling bpf_clone_redirect, it calls
eth_type_trans() again, trying to pull the L2 header a second time.

IIRC I hit this exact issue on an earlier version, before adding the
!is_l2 check, when I enabled non-linear on all programs in test_loader.
empty_skb.c was then triggering kernel BUGs.

I couldn't find a case where this is an actual issue in LWT because it
seems we always at least linearize the L3 header (for route lookups)
before running the LWT programs. And the L3 header is bigger than
ETH_HLEN so eth_type_trans() doesn't throw a kernel BUG.

> 
> I am not familiar with the LWT but I recalled we have fixed some cases on
> the ETH_HLEN and they are now tested by the prog_tests/empty_skb.c.
> 
> I think I am missing something on how is the non-linear skb different from
> the linear skb here for lwt if it has ensured there is ETH_HLEN in the
> linear. I am not sure how active is the lwt/bpf usage now. If it is hard to
> support, I think it is fine to exclude it.

I'm sure it can be done, but not sure it's worth the hassle. As you say,
it's unclear how used they are. Even for Cilium, which I believe was the
initial use case, we don't use them.

> 
> CGROUP_SKB will be good to be supported at the beginning though.

Ack. I'll send a v6 excluding only the LWT types.

[...]

> > > Should linear_sz be limited to "PAGE_SIZE - headroom..." like how
> > > test_run_xdp() does it ?
> > 
> > That changes a bit the current behavior. Currently, we will return
> > EINVAL if a user try to pass more than "PAGE_SIZE - headroom..." as
> > data_size_in. With the test_run_xdp approach, we'll end up silently
> > switching to non-linear mode if they do that.
> > 
> > I'm not against it given it brings consistency with the XDP counterpart,
> > but it could also be a bit surprising.
> 
> I think this is fine to accept what was previously rejected because of
> lacking non-linear skb support.
> 
> 
> I would prefer to have similar expectation as the test_run_xdp for the parts
> that make sense.

Ack.


  reply	other threads:[~2025-10-06 20:50 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-02 10:03 [PATCH bpf-next v5 0/5] Support non-linear skbs for BPF_PROG_TEST_RUN Paul Chaignon
2025-10-02 10:05 ` [PATCH bpf-next v5 1/5] bpf: Refactor cleanup of bpf_prog_test_run_skb Paul Chaignon
2025-10-02 10:07 ` [PATCH bpf-next v5 2/5] bpf: Reorder bpf_prog_test_run_skb initialization Paul Chaignon
2025-10-02 10:07 ` [PATCH bpf-next v5 3/5] bpf: Craft non-linear skbs in BPF_PROG_TEST_RUN Paul Chaignon
2025-10-02 16:07   ` Amery Hung
2025-10-02 16:28     ` Amery Hung
2025-10-02 18:27   ` Martin KaFai Lau
2025-10-06 14:04     ` Paul Chaignon
2025-10-06 18:58       ` Martin KaFai Lau
2025-10-06 20:50         ` Paul Chaignon [this message]
2025-10-02 10:07 ` [PATCH bpf-next v5 4/5] selftests/bpf: Support non-linear flag in test loader Paul Chaignon
2025-10-02 10:07 ` [PATCH bpf-next v5 5/5] selftests/bpf: Test direct packet access on non-linear skbs Paul Chaignon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aOQrjeaErvdq--wT@mail.gmail.com \
    --to=paul.chaignon@gmail.com \
    --cc=ameryhung@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=eddyz87@gmail.com \
    --cc=martin.lau@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox