* [PATCH bpf v4 0/2] bpf: Fix generic devmap egress skb sharing @ 2026-06-11 8:08 Sun Jian 2026-06-11 8:08 ` [PATCH bpf v4 1/2] bpf: Run generic devmap egress prog on private skb Sun Jian 2026-06-11 8:08 ` [PATCH bpf v4 2/2] selftests/bpf: Cover generic devmap egress last-dst rewrite Sun Jian 0 siblings, 2 replies; 6+ messages in thread From: Sun Jian @ 2026-06-11 8:08 UTC (permalink / raw) To: bpf Cc: netdev, linux-kernel, linux-kselftest, ast, daniel, andrii, martin.lau, davem, kuba, hawk, john.fastabend, sdf, shuah, jiayuan.chen, toke, menglong.dong, emil, Sun Jian Generic XDP devmap multi redirect can leave cloned skbs sharing packet data. When a devmap egress program mutates packet data, another destination sharing the same data may observe that mutation. Fix this by making cloned skbs private before running the generic devmap egress program. Add selftest coverage for the last-destination case, where the final destination runs on the original skb while earlier destinations use cloned skbs. --- v4: * Preserve mac_len after skb_copy(). * Separate errno return from XDP action output in dev_map_bpf_prog_run_skb(). * Zero-initialize net_config in the new selftest. v3: https://lore.kernel.org/bpf/20260611043317.512843-1-sun.jian.kdev@gmail.com/T/#m77fde65437b0398cb470649d4f1feeb85bc134bc * Split the kernel fix and selftest into separate patches. * Move the private-copy logic into dev_map_bpf_prog_run_skb(). * Use deterministic DEVMAP_HASH keys in the last-destination selftest. * Fix the Fixes tag. v2: https://lore.kernel.org/bpf/08c35c70-a59e-4e0e-91db-22b5ec30b611@linux.dev/T/#m613671b243e8651c47f84bb329f0ea668974c549 * Move the private-copy step into dev_map_generic_redirect() so the last-destination path is covered as well. * Use skb_copy() instead of skb_unshare() to keep caller ownership unchanged on allocation failure. * Add a generic XDP last-destination selftest case. v1: https://lore.kernel.org/bpf/CABFUUZFimdrZdq=NWi+N-0sJZWvMwY=f4iF6-3TVMS8=m07Zmw@mail.gmail.com/ Sun Jian (2): bpf: Run generic devmap egress prog on private skb selftests/bpf: Cover generic devmap egress last-dst rewrite kernel/bpf/devmap.c | 41 +++-- .../selftests/bpf/prog_tests/test_xdp_veth.c | 152 +++++++++++++++++- 2 files changed, 180 insertions(+), 13 deletions(-) base-commit: e7ae89a0c97ce2b68b0983cd01eda67cf373517d -- 2.43.0 ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH bpf v4 1/2] bpf: Run generic devmap egress prog on private skb 2026-06-11 8:08 [PATCH bpf v4 0/2] bpf: Fix generic devmap egress skb sharing Sun Jian @ 2026-06-11 8:08 ` Sun Jian 2026-06-11 8:30 ` Toke Høiland-Jørgensen 2026-06-11 8:08 ` [PATCH bpf v4 2/2] selftests/bpf: Cover generic devmap egress last-dst rewrite Sun Jian 1 sibling, 1 reply; 6+ messages in thread From: Sun Jian @ 2026-06-11 8:08 UTC (permalink / raw) To: bpf Cc: netdev, linux-kernel, linux-kselftest, ast, daniel, andrii, martin.lau, davem, kuba, hawk, john.fastabend, sdf, shuah, jiayuan.chen, toke, menglong.dong, emil, Sun Jian Generic XDP devmap multi redirect uses skb_clone() for intermediate destinations and sends the last destination with the original skb. This can leave multiple destinations sharing the same packet data. This becomes visible after generic devmap egress-program support was added: a devmap egress program may mutate packet data, and another destination sharing the same data can observe that mutation. Native XDP broadcast redirect does not have this issue because xdpf_clone() copies the frame data for each destination. Generic XDP should provide the same per-destination isolation before running a devmap egress program. Fix this by making cloned skbs private before running the generic devmap egress program. Use skb_copy() instead of skb_unshare() so allocation failure does not consume the skb and the existing caller error paths keep their ownership semantics. Fixes: 2ea5eabaf04a ("bpf: devmap: Implement devmap prog execution for generic XDP") Suggested-by: Jiayuan Chen <jiayuan.chen@linux.dev> Suggested-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com> --- kernel/bpf/devmap.c | 41 +++++++++++++++++++++++++++++++---------- 1 file changed, 31 insertions(+), 10 deletions(-) diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c index cc0a43ebab6b..a3d6c60dbddb 100644 --- a/kernel/bpf/devmap.c +++ b/kernel/bpf/devmap.c @@ -512,35 +512,52 @@ static inline int __xdp_enqueue(struct net_device *dev, struct xdp_frame *xdpf, return 0; } -static u32 dev_map_bpf_prog_run_skb(struct sk_buff *skb, struct bpf_dtab_netdev *dst) +static int dev_map_bpf_prog_run_skb(struct sk_buff **pskb, + struct bpf_dtab_netdev *dst, + u32 *act) { + struct sk_buff *skb = *pskb; struct xdp_txq_info txq = { .dev = dst->dev }; struct xdp_buff xdp; - u32 act; - if (!dst->xdp_prog) - return XDP_PASS; + if (!dst->xdp_prog) { + *act = XDP_PASS; + return 0; + } + + if (skb_cloned(skb)) { + struct sk_buff *nskb; + + nskb = skb_copy(skb, GFP_ATOMIC); + if (!nskb) + return -ENOMEM; + + nskb->mac_len = skb->mac_len; + consume_skb(skb); + skb = nskb; + *pskb = nskb; + } __skb_pull(skb, skb->mac_len); xdp.txq = &txq; - act = bpf_prog_run_generic_xdp(skb, &xdp, dst->xdp_prog); - switch (act) { + *act = bpf_prog_run_generic_xdp(skb, &xdp, dst->xdp_prog); + switch (*act) { case XDP_PASS: __skb_push(skb, skb->mac_len); break; default: - bpf_warn_invalid_xdp_action(NULL, dst->xdp_prog, act); + bpf_warn_invalid_xdp_action(NULL, dst->xdp_prog, *act); fallthrough; case XDP_ABORTED: - trace_xdp_exception(dst->dev, dst->xdp_prog, act); + trace_xdp_exception(dst->dev, dst->xdp_prog, *act); fallthrough; case XDP_DROP: kfree_skb(skb); break; } - return act; + return 0; } int dev_xdp_enqueue(struct net_device *dev, struct xdp_frame *xdpf, @@ -700,6 +717,7 @@ int dev_map_enqueue_multi(struct xdp_frame *xdpf, struct net_device *dev_rx, int dev_map_generic_redirect(struct bpf_dtab_netdev *dst, struct sk_buff *skb, const struct bpf_prog *xdp_prog) { + u32 act; int err; err = xdp_ok_fwd_dev(dst->dev, skb->len); @@ -710,7 +728,10 @@ int dev_map_generic_redirect(struct bpf_dtab_netdev *dst, struct sk_buff *skb, * return 0 even if packet is dropped. Helper below takes care of * freeing skb. */ - if (dev_map_bpf_prog_run_skb(skb, dst) != XDP_PASS) + err = dev_map_bpf_prog_run_skb(&skb, dst, &act); + if (err) + return err; + if (act != XDP_PASS) return 0; skb->dev = dst->dev; -- 2.43.0 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH bpf v4 1/2] bpf: Run generic devmap egress prog on private skb 2026-06-11 8:08 ` [PATCH bpf v4 1/2] bpf: Run generic devmap egress prog on private skb Sun Jian @ 2026-06-11 8:30 ` Toke Høiland-Jørgensen 2026-06-11 9:02 ` sun jian 0 siblings, 1 reply; 6+ messages in thread From: Toke Høiland-Jørgensen @ 2026-06-11 8:30 UTC (permalink / raw) To: Sun Jian, bpf Cc: netdev, linux-kernel, linux-kselftest, ast, daniel, andrii, martin.lau, davem, kuba, hawk, john.fastabend, sdf, shuah, jiayuan.chen, menglong.dong, emil, Sun Jian Sun Jian <sun.jian.kdev@gmail.com> writes: > Generic XDP devmap multi redirect uses skb_clone() for intermediate > destinations and sends the last destination with the original skb. This > can leave multiple destinations sharing the same packet data. > > This becomes visible after generic devmap egress-program support was > added: a devmap egress program may mutate packet data, and another > destination sharing the same data can observe that mutation. > > Native XDP broadcast redirect does not have this issue because > xdpf_clone() copies the frame data for each destination. Generic XDP > should provide the same per-destination isolation before running a > devmap egress program. > > Fix this by making cloned skbs private before running the generic devmap > egress program. Use skb_copy() instead of skb_unshare() so allocation > failure does not consume the skb and the existing caller error paths keep > their ownership semantics. > > Fixes: 2ea5eabaf04a ("bpf: devmap: Implement devmap prog execution for generic XDP") > Suggested-by: Jiayuan Chen <jiayuan.chen@linux.dev> > Suggested-by: Jakub Kicinski <kuba@kernel.org> > Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> > Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com> > --- > kernel/bpf/devmap.c | 41 +++++++++++++++++++++++++++++++---------- > 1 file changed, 31 insertions(+), 10 deletions(-) > > diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c > index cc0a43ebab6b..a3d6c60dbddb 100644 > --- a/kernel/bpf/devmap.c > +++ b/kernel/bpf/devmap.c > @@ -512,35 +512,52 @@ static inline int __xdp_enqueue(struct net_device *dev, struct xdp_frame *xdpf, > return 0; > } > > -static u32 dev_map_bpf_prog_run_skb(struct sk_buff *skb, struct bpf_dtab_netdev *dst) > +static int dev_map_bpf_prog_run_skb(struct sk_buff **pskb, > + struct bpf_dtab_netdev *dst, > + u32 *act) > { > + struct sk_buff *skb = *pskb; > struct xdp_txq_info txq = { .dev = dst->dev }; > struct xdp_buff xdp; > - u32 act; > > - if (!dst->xdp_prog) > - return XDP_PASS; > + if (!dst->xdp_prog) { > + *act = XDP_PASS; > + return 0; > + } > + > + if (skb_cloned(skb)) { > + struct sk_buff *nskb; > + > + nskb = skb_copy(skb, GFP_ATOMIC); > + if (!nskb) > + return -ENOMEM; > + > + nskb->mac_len = skb->mac_len; > + consume_skb(skb); > + skb = nskb; > + *pskb = nskb; > + } So with all this pointer soup and back and forth, the version you had in v2 (with the check and skb_copy() in dev_map_generic_redirect()) was much cleaner :/ -Toke ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH bpf v4 1/2] bpf: Run generic devmap egress prog on private skb 2026-06-11 8:30 ` Toke Høiland-Jørgensen @ 2026-06-11 9:02 ` sun jian 0 siblings, 0 replies; 6+ messages in thread From: sun jian @ 2026-06-11 9:02 UTC (permalink / raw) To: Toke Høiland-Jørgensen Cc: bpf, netdev, linux-kernel, linux-kselftest, ast, daniel, andrii, martin.lau, davem, kuba, hawk, john.fastabend, sdf, shuah, jiayuan.chen, menglong.dong, emil On Thu, Jun 11, 2026 at 4:30 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > Sun Jian <sun.jian.kdev@gmail.com> writes: > > > Generic XDP devmap multi redirect uses skb_clone() for intermediate > > destinations and sends the last destination with the original skb. This > > can leave multiple destinations sharing the same packet data. > > > > This becomes visible after generic devmap egress-program support was > > added: a devmap egress program may mutate packet data, and another > > destination sharing the same data can observe that mutation. > > > > Native XDP broadcast redirect does not have this issue because > > xdpf_clone() copies the frame data for each destination. Generic XDP > > should provide the same per-destination isolation before running a > > devmap egress program. > > > > Fix this by making cloned skbs private before running the generic devmap > > egress program. Use skb_copy() instead of skb_unshare() so allocation > > failure does not consume the skb and the existing caller error paths keep > > their ownership semantics. > > > > Fixes: 2ea5eabaf04a ("bpf: devmap: Implement devmap prog execution for generic XDP") > > Suggested-by: Jiayuan Chen <jiayuan.chen@linux.dev> > > Suggested-by: Jakub Kicinski <kuba@kernel.org> > > Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> > > Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com> > > --- > > kernel/bpf/devmap.c | 41 +++++++++++++++++++++++++++++++---------- > > 1 file changed, 31 insertions(+), 10 deletions(-) > > > > diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c > > index cc0a43ebab6b..a3d6c60dbddb 100644 > > --- a/kernel/bpf/devmap.c > > +++ b/kernel/bpf/devmap.c > > @@ -512,35 +512,52 @@ static inline int __xdp_enqueue(struct net_device *dev, struct xdp_frame *xdpf, > > return 0; > > } > > > > -static u32 dev_map_bpf_prog_run_skb(struct sk_buff *skb, struct bpf_dtab_netdev *dst) > > +static int dev_map_bpf_prog_run_skb(struct sk_buff **pskb, > > + struct bpf_dtab_netdev *dst, > > + u32 *act) > > { > > + struct sk_buff *skb = *pskb; > > struct xdp_txq_info txq = { .dev = dst->dev }; > > struct xdp_buff xdp; > > - u32 act; > > > > - if (!dst->xdp_prog) > > - return XDP_PASS; > > + if (!dst->xdp_prog) { > > + *act = XDP_PASS; > > + return 0; > > + } > > + > > + if (skb_cloned(skb)) { > > + struct sk_buff *nskb; > > + > > + nskb = skb_copy(skb, GFP_ATOMIC); > > + if (!nskb) > > + return -ENOMEM; > > + > > + nskb->mac_len = skb->mac_len; > > + consume_skb(skb); > > + skb = nskb; > > + *pskb = nskb; > > + } > > So with all this pointer soup and back and forth, the version you had in > v2 (with the check and skb_copy() in dev_map_generic_redirect()) was > much cleaner :/ > > -Toke > Hi Toke, Thanks, that makes sense. I agree that v4 made the helper interface less clean. I'll wait for more feedback before respinning. Sun Jian ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH bpf v4 2/2] selftests/bpf: Cover generic devmap egress last-dst rewrite 2026-06-11 8:08 [PATCH bpf v4 0/2] bpf: Fix generic devmap egress skb sharing Sun Jian 2026-06-11 8:08 ` [PATCH bpf v4 1/2] bpf: Run generic devmap egress prog on private skb Sun Jian @ 2026-06-11 8:08 ` Sun Jian 2026-06-11 8:24 ` sashiko-bot 1 sibling, 1 reply; 6+ messages in thread From: Sun Jian @ 2026-06-11 8:08 UTC (permalink / raw) To: bpf Cc: netdev, linux-kernel, linux-kselftest, ast, daniel, andrii, martin.lau, davem, kuba, hawk, john.fastabend, sdf, shuah, jiayuan.chen, toke, menglong.dong, emil, Sun Jian Strengthen xdp_veth_egress to check that each destination observes the MAC selected for its own egress ifindex, instead of only checking that the observed MAC differs from a single magic value. Add a generic XDP last-destination test where earlier destinations do not have a devmap egress program while the final destination does. This covers the case where the final destination runs on the original skb and could otherwise rewrite packet data still shared with earlier cloned skbs. Use deterministic DEVMAP_HASH keys for the egress map so the intended last destination is stable. Suggested-by: Jiayuan Chen <jiayuan.chen@linux.dev> Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com> --- .../selftests/bpf/prog_tests/test_xdp_veth.c | 152 +++++++++++++++++- 1 file changed, 149 insertions(+), 3 deletions(-) diff --git a/tools/testing/selftests/bpf/prog_tests/test_xdp_veth.c b/tools/testing/selftests/bpf/prog_tests/test_xdp_veth.c index 3e98a1665936..0fa2cf835c62 100644 --- a/tools/testing/selftests/bpf/prog_tests/test_xdp_veth.c +++ b/tools/testing/selftests/bpf/prog_tests/test_xdp_veth.c @@ -456,7 +456,11 @@ static void xdp_veth_egress(u32 flags) .remote_flags = flags, } }; - const char magic_mac[6] = { 0xAA, 0xBB, 0xCC, 0xDD, 0xEE, 0xFF}; + const unsigned char egress_macs[VETH_PAIRS_COUNT][ETH_ALEN] = { + { 0xAA, 0xBB, 0xCC, 0xDD, 0xEE, 0x01 }, + { 0xAA, 0xBB, 0xCC, 0xDD, 0xEE, 0x02 }, + { 0xAA, 0xBB, 0xCC, 0xDD, 0xEE, 0x03 }, + }; struct xdp_redirect_multi_kern *xdp_redirect_multi_kern; struct bpf_object *bpf_objs[VETH_EGRESS_SKEL_NB]; struct xdp_redirect_map *xdp_redirect_map; @@ -512,7 +516,7 @@ static void xdp_veth_egress(u32 flags) &net_config, prog_cfg, i)) goto destroy_xdp_redirect_map; - err = bpf_map_update_elem(mac_map, &ifindex, magic_mac, 0); + err = bpf_map_update_elem(mac_map, &ifindex, egress_macs[i], 0); if (!ASSERT_OK(err, "bpf_map_update_elem")) goto destroy_xdp_redirect_map; @@ -530,16 +534,155 @@ static void xdp_veth_egress(u32 flags) goto destroy_xdp_redirect_map; for (i = 0; i < 2; i++) { + u32 key = i; + __be64 expected = 0; + u64 res; + + err = bpf_map_lookup_elem(res_map, &key, &res); + if (!ASSERT_OK(err, "get MAC res")) + goto destroy_xdp_redirect_map; + + /* store_mac_1/2 run on the second/third remote veths. */ + memcpy(&expected, egress_macs[i + 1], ETH_ALEN); + ASSERT_EQ(res, expected, "compare mac"); + } + +destroy_xdp_redirect_map: + close_netns(nstoken); + xdp_redirect_map__destroy(xdp_redirect_map); +destroy_xdp_redirect_multi_kern: + xdp_redirect_multi_kern__destroy(xdp_redirect_multi_kern); +destroy_xdp_dummy: + xdp_dummy__destroy(xdp_dummy); + + cleanup_network(&net_config); +} + +static void xdp_veth_egress_last_dst(u32 flags) +{ + struct prog_configuration prog_cfg[VETH_PAIRS_COUNT] = { + { + .local_name = "xdp_redirect_map_all_prog", + .remote_name = "store_mac_1", + .local_flags = flags, + .remote_flags = flags, + }, + { + .local_name = "xdp_redirect_map_all_prog", + .remote_name = "store_mac_2", + .local_flags = flags, + .remote_flags = flags, + }, + { + .local_name = "xdp_redirect_map_all_prog", + .remote_name = "xdp_dummy_prog", + .local_flags = flags, + .remote_flags = flags, + } + }; + const unsigned char egress_macs[VETH_PAIRS_COUNT][ETH_ALEN] = { + { 0xAA, 0xBB, 0xCC, 0xDD, 0xEE, 0x01 }, + { 0xAA, 0xBB, 0xCC, 0xDD, 0xEE, 0x02 }, + { 0xAA, 0xBB, 0xCC, 0xDD, 0xEE, 0x03 }, + }; + struct xdp_redirect_multi_kern *xdp_redirect_multi_kern; + struct bpf_object *bpf_objs[VETH_EGRESS_SKEL_NB]; + struct xdp_redirect_map *xdp_redirect_map; + struct net_configuration net_config = {}; + int mac_map, egress_map, res_map; + struct nstoken *nstoken = NULL; + struct xdp_dummy *xdp_dummy; + __be64 last_mac = 0; + bool found = false; + int err; + int i; + + xdp_dummy = xdp_dummy__open_and_load(); + if (!ASSERT_OK_PTR(xdp_dummy, "xdp_dummy__open_and_load")) + return; + + xdp_redirect_multi_kern = xdp_redirect_multi_kern__open_and_load(); + if (!ASSERT_OK_PTR(xdp_redirect_multi_kern, "xdp_redirect_multi_kern__open_and_load")) + goto destroy_xdp_dummy; + + xdp_redirect_map = xdp_redirect_map__open_and_load(); + if (!ASSERT_OK_PTR(xdp_redirect_map, "xdp_redirect_map__open_and_load")) + goto destroy_xdp_redirect_multi_kern; + + if (!ASSERT_OK(create_network(&net_config), "create network")) + goto destroy_xdp_redirect_map; + + mac_map = bpf_map__fd(xdp_redirect_multi_kern->maps.mac_map); + if (!ASSERT_OK_FD(mac_map, "open mac_map")) + goto destroy_xdp_redirect_map; + + egress_map = bpf_map__fd(xdp_redirect_multi_kern->maps.map_egress); + if (!ASSERT_OK_FD(egress_map, "open map_egress")) + goto destroy_xdp_redirect_map; + + bpf_objs[0] = xdp_dummy->obj; + bpf_objs[1] = xdp_redirect_multi_kern->obj; + bpf_objs[2] = xdp_redirect_map->obj; + + nstoken = open_netns(net_config.ns0_name); + if (!ASSERT_OK_PTR(nstoken, "open NS0")) + goto destroy_xdp_redirect_map; + + for (i = 0; i < VETH_PAIRS_COUNT; i++) { + struct bpf_devmap_val devmap_val = {}; + int ifindex = if_nametoindex(net_config.veth_cfg[i].local_veth); + u32 key = i; + + SYS(destroy_xdp_redirect_map, + "ip -n %s neigh add %s lladdr 00:00:00:00:00:01 dev %s", + net_config.veth_cfg[i].namespace, IP_NEIGH, + net_config.veth_cfg[i].remote_veth); + + if (attach_programs_to_veth_pair(bpf_objs, VETH_EGRESS_SKEL_NB, + &net_config, prog_cfg, i)) + goto destroy_xdp_redirect_map; + + err = bpf_map_update_elem(mac_map, &ifindex, egress_macs[i], 0); + if (!ASSERT_OK(err, "bpf_map_update_elem")) + goto destroy_xdp_redirect_map; + + devmap_val.ifindex = ifindex; + devmap_val.bpf_prog.fd = -1; + + if (i == VETH_PAIRS_COUNT - 1) + devmap_val.bpf_prog.fd = + bpf_program__fd(xdp_redirect_multi_kern->progs.xdp_devmap_prog); + + err = bpf_map_update_elem(egress_map, &key, &devmap_val, 0); + if (!ASSERT_OK(err, "bpf_map_update_elem")) + goto destroy_xdp_redirect_map; + } + + SYS_NOFAIL("ip netns exec %s ping %s -i 0.1 -c 4 -W1 > /dev/null ", + net_config.veth_cfg[0].namespace, IP_NEIGH); + + res_map = bpf_map__fd(xdp_redirect_map->maps.rx_mac); + if (!ASSERT_OK_FD(res_map, "open rx_map")) + goto destroy_xdp_redirect_map; + + memcpy(&last_mac, egress_macs[VETH_PAIRS_COUNT - 1], ETH_ALEN); + + for (i = 0; i < VETH_PAIRS_COUNT - 1; i++) { u32 key = i; u64 res; err = bpf_map_lookup_elem(res_map, &key, &res); + if (err == -ENOENT) + continue; if (!ASSERT_OK(err, "get MAC res")) goto destroy_xdp_redirect_map; - ASSERT_STRNEQ((const char *)&res, magic_mac, ETH_ALEN, "compare mac"); + found = true; + ASSERT_NEQ(res, last_mac, "compare last dst mac"); } + ASSERT_TRUE(found, "found earlier dst mac"); + destroy_xdp_redirect_map: close_netns(nstoken); xdp_redirect_map__destroy(xdp_redirect_map); @@ -596,4 +739,7 @@ void test_xdp_veth_egress(void) if (test__start_subtest("SKB_MODE/egress")) xdp_veth_egress(XDP_FLAGS_SKB_MODE); + + if (test__start_subtest("SKB_MODE/egress_last_dst")) + xdp_veth_egress_last_dst(XDP_FLAGS_SKB_MODE); } -- 2.43.0 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH bpf v4 2/2] selftests/bpf: Cover generic devmap egress last-dst rewrite 2026-06-11 8:08 ` [PATCH bpf v4 2/2] selftests/bpf: Cover generic devmap egress last-dst rewrite Sun Jian @ 2026-06-11 8:24 ` sashiko-bot 0 siblings, 0 replies; 6+ messages in thread From: sashiko-bot @ 2026-06-11 8:24 UTC (permalink / raw) To: Sun Jian; +Cc: bpf Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider: New issues: - [High] bpf_map_lookup_elem on an ARRAY map never returns -ENOENT, masking dropped packets and causing a false positive test pass. - [Low] The test performs manual FD validation (`ASSERT_OK_FD`) on map file descriptors immediately after successfully loading the skeleton, violating the explicit BPF selftest guidelines. Pre-existing issues: - [Critical] Uninitialized stack structure `net_config` is passed to cleanup functions on early error paths, causing arbitrary shell command execution via stack garbage. - [High] bpf_map_update_elem reads 8 bytes from a 6-byte user-space array, causing an out-of-bounds stack read. -- commit 89d44c6ff3ea8b2a422a70685fda70fa6ebcbf9c Author: Sun Jian <sun.jian.kdev@gmail.com> selftests/bpf: Cover generic devmap egress last-dst rewrite Strengthen xdp_veth_egress to check that each destination observes the MAC selected for its own egress ifindex, instead of only checking that the observed MAC differs from a single magic value. Add a generic XDP last-destination test where earlier destinations do not have a devmap egress program while the final destination does. > diff --git a/tools/testing/selftests/bpf/prog_tests/test_xdp_veth.c b/tools/testing/selftests/bpf/prog_tests/test_xdp_veth.c > index 3e98a16659362..0fa2cf835c629 100644 > --- a/tools/testing/selftests/bpf/prog_tests/test_xdp_veth.c > +++ b/tools/testing/selftests/bpf/prog_tests/test_xdp_veth.c > @@ -456,7 +456,11 @@ static void xdp_veth_egress(u32 flags) > .remote_flags = flags, > } > }; > - const char magic_mac[6] = { 0xAA, 0xBB, 0xCC, 0xDD, 0xEE, 0xFF}; > + const unsigned char egress_macs[VETH_PAIRS_COUNT][ETH_ALEN] = { > + { 0xAA, 0xBB, 0xCC, 0xDD, 0xEE, 0x01 }, > + { 0xAA, 0xBB, 0xCC, 0xDD, 0xEE, 0x02 }, > + { 0xAA, 0xBB, 0xCC, 0xDD, 0xEE, 0x03 }, > + }; [Severity: Critical] This is a pre-existing issue, but does the error handling in xdp_veth_egress() execute arbitrary shell commands? In xdp_veth_egress(), net_config is declared on the stack without initialization. If an early failure occurs (such as xdp_dummy__open_and_load() failing), the error path calls cleanup_network(&net_config). This cleanup function invokes SYS_NOFAIL("ip netns del %s", net_config.ns0_name), which will execute a command using whatever stack garbage is in the uninitialized ns0_name field. The new xdp_veth_egress_last_dst() function added in this patch correctly uses net_config = {} to avoid this problem. [ ... ] > @@ -512,7 +516,7 @@ static void xdp_veth_egress(u32 flags) > &net_config, prog_cfg, i)) > goto destroy_xdp_redirect_map; > > - err = bpf_map_update_elem(mac_map, &ifindex, magic_mac, 0); > + err = bpf_map_update_elem(mac_map, &ifindex, egress_macs[i], 0); [Severity: High] This is a pre-existing issue duplicated by this patch, but does this bpf_map_update_elem() call read out of bounds? The mac_map is defined in xdp_redirect_multi_kern.c with a value_size of 8 bytes (__be64), but egress_macs elements are only 6 bytes (ETH_ALEN). For the final element in the array, this operation will read 2 bytes past the end of the egress_macs stack array into stack garbage. [ ... ] > +static void xdp_veth_egress_last_dst(u32 flags) > +{ [ ... ] > + mac_map = bpf_map__fd(xdp_redirect_multi_kern->maps.mac_map); > + if (!ASSERT_OK_FD(mac_map, "open mac_map")) > + goto destroy_xdp_redirect_map; > + > + egress_map = bpf_map__fd(xdp_redirect_multi_kern->maps.map_egress); > + if (!ASSERT_OK_FD(egress_map, "open map_egress")) > + goto destroy_xdp_redirect_map; [Severity: Low] Is this FD validation necessary? According to BPF selftest guidelines, after a successful xdp_redirect_multi_kern__open_and_load(), all maps are guaranteed to be created with valid FDs. Manual FD validation like this is not needed when using skeleton-generated fields. [ ... ] > + err = bpf_map_update_elem(mac_map, &ifindex, egress_macs[i], 0); > + if (!ASSERT_OK(err, "bpf_map_update_elem")) > + goto destroy_xdp_redirect_map; [Severity: High] Does this bpf_map_update_elem() call also read out of bounds, similar to the pre-existing issue above? The map expects an 8-byte value (__be64), but the egress_macs array element provides only 6 bytes. [ ... ] > + for (i = 0; i < VETH_PAIRS_COUNT - 1; i++) { > u32 key = i; > u64 res; > > err = bpf_map_lookup_elem(res_map, &key, &res); > + if (err == -ENOENT) > + continue; > if (!ASSERT_OK(err, "get MAC res")) > goto destroy_xdp_redirect_map; > > - ASSERT_STRNEQ((const char *)&res, magic_mac, ETH_ALEN, "compare mac"); > + found = true; > + ASSERT_NEQ(res, last_mac, "compare last dst mac"); > } [Severity: High] Can bpf_map_lookup_elem() actually return -ENOENT for this map type? Looking at xdp_redirect_map.c, rx_mac is a BPF_MAP_TYPE_ARRAY. Since array maps pre-allocate their elements and zero-initialize them, lookups for valid indices will always succeed (returning 0) and yield the current element value (0 if unwritten), rather than returning -ENOENT. If a packet is dropped and never updates the map, the value will remain 0. The bpf_map_lookup_elem() call will succeed (err == 0), and the test will trivially pass because res (0) is not equal to last_mac. Does this mask datapath failures and cause a false positive test pass? -- Sashiko AI review · https://sashiko.dev/#/patchset/20260611080850.536996-1-sun.jian.kdev@gmail.com?part=2 ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2026-06-11 9:02 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-06-11 8:08 [PATCH bpf v4 0/2] bpf: Fix generic devmap egress skb sharing Sun Jian 2026-06-11 8:08 ` [PATCH bpf v4 1/2] bpf: Run generic devmap egress prog on private skb Sun Jian 2026-06-11 8:30 ` Toke Høiland-Jørgensen 2026-06-11 9:02 ` sun jian 2026-06-11 8:08 ` [PATCH bpf v4 2/2] selftests/bpf: Cover generic devmap egress last-dst rewrite Sun Jian 2026-06-11 8:24 ` sashiko-bot
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox