From: Martin KaFai Lau <martin.lau@linux.dev>
To: Feng Yang <yangfeng59949@163.com>
Cc: andrii@kernel.org, ast@kernel.org, bpf@vger.kernel.org,
daniel@iogearbox.net, davem@davemloft.net, edumazet@google.com,
horms@kernel.org, kuba@kernel.org, linux-kernel@vger.kernel.org,
netdev@vger.kernel.org, pabeni@redhat.com
Subject: Re: [PATCH v5] bpf: test_run: Fix the null pointer dereference issue in bpf_lwt_xmit_push_encap
Date: Tue, 17 Feb 2026 10:08:29 -0800
Message-ID: <6528c028-1224-4737-b902-e631641ddb37@linux.dev>
In-Reply-To: <20260211075240.70095-1-yangfeng59949@163.com>
On 2/10/26 11:52 PM, Feng Yang wrote:
> On Tue, 10 Feb 2026 11:10:03 -0800 Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
>> On 2/10/26 1:06 AM, Feng Yang wrote:
>>> From: Feng Yang <yangfeng@kylinos.cn>
>>>
>>> The bpf_lwt_xmit_push_encap helper needs to access skb_dst(skb)->dev to
>>> calculate the needed headroom:
>>>
>>> err = skb_cow_head(skb,
>>> len + LL_RESERVED_SPACE(skb_dst(skb)->dev));
>>>
>>> But skb->_skb_refdst may not be initialized when the skb is set up by
>>> bpf_prog_test_run_skb function. Executing bpf_lwt_push_ip_encap function
>>> in this scenario will trigger null pointer dereference, causing a kernel
>>> crash as Yinhao reported:
>>>
>>> [ 105.186365] BUG: kernel NULL pointer dereference, address: 0000000000000000
>>> [ 105.186382] #PF: supervisor read access in kernel mode
>>> [ 105.186388] #PF: error_code(0x0000) - not-present page
>>> [ 105.186393] PGD 121d3d067 P4D 121d3d067 PUD 106c83067 PMD 0
>>> [ 105.186404] Oops: 0000 [#1] PREEMPT SMP NOPTI
>>> [ 105.186412] CPU: 3 PID: 3250 Comm: poc Kdump: loaded Not tainted 6.19.0-rc5 #1
>>> [ 105.186423] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
>>> [ 105.186427] RIP: 0010:bpf_lwt_push_ip_encap+0x1eb/0x520
>>> [ 105.186443] Code: 0f 84 de 01 00 00 0f b7 4a 04 66 85 c9 0f 85 47 01 00 00 31 c0 5b 5d 41 5c 41 5d 41 5e c3 cc cc cc cc 48 8b 73 58 48 83 e6 fe <48> 8b 36 0f b7 be ec 00 00 00 0f b7 b6 e6 00 00 00 01 fe 83 e6 f0
>>> [ 105.186449] RSP: 0018:ffffbb0e0387bc50 EFLAGS: 00010246
>>> [ 105.186455] RAX: 000000000000004e RBX: ffff94c74e036500 RCX: ffff94c74874da00
>>> [ 105.186460] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff94c74e036500
>>> [ 105.186463] RBP: 0000000000000001 R08: 0000000000000002 R09: 0000000000000000
>>> [ 105.186467] R10: ffffbb0e0387bd50 R11: 0000000000000000 R12: ffffbb0e0387bc98
>>> [ 105.186471] R13: 0000000000000014 R14: 0000000000000000 R15: 0000000000000002
>>> [ 105.186484] FS: 00007f166aa4d680(0000) GS:ffff94c8b7780000(0000) knlGS:0000000000000000
>>> [ 105.186490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [ 105.186494] CR2: 0000000000000000 CR3: 000000015eade001 CR4: 0000000000770ee0
>>> [ 105.186499] PKRU: 55555554
>>> [ 105.186502] Call Trace:
>>> [ 105.186507] <TASK>
>>> [ 105.186513] bpf_lwt_xmit_push_encap+0x2b/0x40
>>> [ 105.186522] bpf_prog_a75eaad51e517912+0x41/0x49
>>> [ 105.186536] ? kvm_clock_get_cycles+0x18/0x30
>>> [ 105.186547] ? ktime_get+0x3c/0xa0
>>> [ 105.186554] bpf_test_run+0x195/0x320
>>> [ 105.186563] ? bpf_test_run+0x10f/0x320
>>> [ 105.186579] bpf_prog_test_run_skb+0x2f5/0x4f0
>>> [ 105.186590] __sys_bpf+0x69c/0xa40
>>> [ 105.186603] __x64_sys_bpf+0x1e/0x30
>>> [ 105.186611] do_syscall_64+0x59/0x110
>>> [ 105.186620] entry_SYSCALL_64_after_hwframe+0x76/0xe0
>>> [ 105.186649] RIP: 0033:0x7f166a97455d
>>>
>>> Temporarily add the setting of skb->_skb_refdst before bpf_test_run to resolve the issue.
>>>
>>> Fixes: 52f278774e79 ("bpf: implement BPF_LWT_ENCAP_IP mode in bpf_lwt_push_encap")
>>> Reported-by: Yinhao Hu <dddddd@hust.edu.cn>
>>> Reported-by: Kaiyan Mei <M202472210@hust.edu.cn>
>>> Closes: https://groups.google.com/g/hust-os-kernel-patches/c/8-a0kPpBW2s
>>> Signed-off-by: Yun Lu <luyun@kylinos.cn>
>>> Signed-off-by: Feng Yang <yangfeng@kylinos.cn>
>>> Tested-by: syzbot@syzkaller.appspotmail.com
>>> ---
>>> Changes in v5:
>>> - Refer to the bpf_lwt_xmit_reroute function to configure the dst parameter.
>>> - Link to v4: https://lore.kernel.org/all/20260209015111.28144-1-yangfeng59949@163.com/
>>> Changes in v4:
>>> - add rcu lock
>>> - Link to v3: https://lore.kernel.org/all/20260206055113.63476-1-yangfeng59949@163.com/
>>> Changes in v3:
>>> - use dst_init
>>> - Link to v2: https://lore.kernel.org/all/20260205092227.126665-1-yangfeng59949@163.com/
>>> Changes in v2:
>>> - Link to v1: https://lore.kernel.org/all/20260127084520.13890-1-luyun_611@163.com/
>>
>> The earlier syzbot reports are crying for a selftest which is still
>> missing in v5.
>>
>> The CI has also reported errors in the test_progs. Did you run any test
>> before posting ?
>
> My apologies. I only tested whether a crash would occur
> when using `bpf_prog_test_run_skb` to execute `bpf_lwt_push_ip_encap` without a dst entry.
> I will include a selftest in my next submission.
>
>> pw-bot: cr
>>
>>> ---
>>> net/bpf/test_run.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++
>>> 1 file changed, 47 insertions(+)
>>>
>>> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
>>> index 178c4738e63b..dbd2c44da7e5 100644
>>> --- a/net/bpf/test_run.c
>>> +++ b/net/bpf/test_run.c
>>> @@ -24,6 +24,7 @@
>>> #include <net/netdev_rx_queue.h>
>>> #include <net/xdp.h>
>>> #include <net/netfilter/nf_bpf_link.h>
>>> +#include <net/ip6_route.h>
>>>
>>> #define CREATE_TRACE_POINTS
>>> #include <trace/events/bpf_test_run.h>
>>> @@ -992,6 +993,7 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
>>> u32 headroom = NET_SKB_PAD + NET_IP_ALIGN;
>>> u32 linear_sz = kattr->test.data_size_in;
>>> u32 repeat = kattr->test.repeat;
>>> + struct dst_entry *dst = NULL;
>>> struct __sk_buff *ctx = NULL;
>>> struct sk_buff *skb = NULL;
>>> struct sock *sk = NULL;
>>> @@ -1156,6 +1158,51 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
>>> skb->ip_summed = CHECKSUM_COMPLETE;
>>> }
>>>
>>> + if (prog->type == BPF_PROG_TYPE_LWT_XMIT && !skb_dst(skb)) {
>>> + struct flowi4 fl4 = {};
>>> + struct flowi6 fl6 = {};
>>> + struct rtable *rt;
>>> +
>>> + switch (skb->protocol) {
>>> + case htons(ETH_P_IP):
>>> + if (sizeof(struct iphdr) <= skb_headlen(skb)) {
>>> + fl4.saddr = ip_hdr(skb)->saddr;
>>> + fl4.daddr = ip_hdr(skb)->daddr;
>>> + }
>>> +
>>> + rt = ip_route_output_key(net, &fl4);
>>
>> What can be expected from the return value rt if fl4 is all zeros?
>
> If it is empty, `ip_route_output_key_hash_rcu` will assign the loopback address,
> and the returned `rt` is valid, not an error value.
>
>>> + if (IS_ERR(rt)) {
>>> + ret = PTR_ERR(rt);
>>
>> I suspect this is probably what failed in CI. Add a NULL check in
>> bpf_lwt_push_ip_encap instead. No change is needed in the
>> bpf_prog_test_run_skb.
>
> The root cause of the CI failure is that the default path was taken, resulting in a return value of -EINVAL.
> So, is your suggestion to modify the bpf_lwt_push_ip_encap function as in the v1 version (https://lore.kernel.org/all/20260127084520.13890-1-luyun_611@163.com/)?
> However, the previous suggestion was to make changes to the bpf_prog_test_run_skb function instead.
>
> Thank you very much for your reply.
>
>>> + goto out;
>>> + }
>>> + dst = &rt->dst;
>>> + break;
>>> +#if IS_ENABLED(CONFIG_IPV6)
>>> + case htons(ETH_P_IPV6):
>>> + if (sizeof(struct ipv6hdr) <= skb_headlen(skb)) {
>>> + fl6.saddr = ipv6_hdr(skb)->saddr;
>>> + fl6.daddr = ipv6_hdr(skb)->daddr;
>>> + }
>>> +
>>> + dst = ip6_route_output(net, NULL, &fl6);
>>> + if (IS_ERR(dst)) {
>>> + ret = PTR_ERR(dst);
>>> + goto out;
>>> + }
>>> + break;
>>> +#endif
>>> + default:
>>> + ret = -EINVAL;
>
> This is the reason for the CI test failure.

Makes sense, but the earlier point stays the same: the user-provided skb
can have unexpected data. Either skb->protocol is not handled here,
or the dst lookup returns an error. I don't know what active users of
lwt are left. Unless the missing skb_dst() is also an issue in other
mainstream program types (e.g., tc), I would prefer to add a check in
bpf_lwt_push_ip_encap() instead of complicating bpf_prog_test_run_skb()
further.
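
To illustrate the kind of check meant here, below is a minimal userspace
sketch, not kernel code and not a proposed patch. All mock_* names are
hypothetical stand-ins for sk_buff, dst_entry and net_device, the headroom
helper is only modeled on LL_RESERVED_SPACE(), and returning -EINVAL for a
missing dst is an assumption about what the helper could return.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel's definitions. */
struct mock_net_device {
	unsigned short hard_header_len;
	unsigned short needed_headroom;
};

struct mock_dst_entry {
	struct mock_net_device *dev;
};

struct mock_sk_buff {
	struct mock_dst_entry *dst;	/* NULL models an skb with no dst attached */
};

#define MOCK_HH_DATA_MOD 16

/* Modeled on LL_RESERVED_SPACE(): round down to a multiple of 16, then add 16. */
static unsigned int mock_ll_reserved_space(const struct mock_net_device *dev)
{
	return ((dev->hard_header_len + dev->needed_headroom) &
		~(MOCK_HH_DATA_MOD - 1)) + MOCK_HH_DATA_MOD;
}

/*
 * Returns the headroom the egress path would ask for, or -EINVAL
 * (an assumed error code) when no dst or device is attached, instead of
 * dereferencing a NULL pointer.
 */
static int mock_encap_headroom(const struct mock_sk_buff *skb, unsigned int len)
{
	if (!skb->dst || !skb->dst->dev)
		return -EINVAL;

	return (int)(len + mock_ll_reserved_space(skb->dst->dev));
}
```

With a check of this shape, an skb built by bpf_prog_test_run_skb() without
a dst would make the helper fail early, and the test_run path itself would
not need to do a route lookup.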
>
>>> + goto out;
>>> + }
>>> +
>>> + if (unlikely(dst->error)) {
>>> + ret = dst->error;
>>> + dst_release(dst);
>>> + goto out;
>>> + }
>>> + skb_dst_set(skb, dst);
>>> + }
>>> ret = bpf_test_run(prog, skb, repeat, &retval, &duration, false);
>>> if (ret)
>>> goto out;
>