From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sun Jian
To: bpf@vger.kernel.org
Cc: Menglong Dong, Emil Tsalapatis, Sun Jian, Jiayuan Chen,
	Alexei Starovoitov, Daniel Borkmann, "David S. Miller",
	Jakub Kicinski, Jesper Dangaard Brouer, John Fastabend,
	Stanislav Fomichev, Andrii Nakryiko, Martin KaFai Lau,
	Eduard Zingerman, Kumar Kartikeya Dwivedi, Song Liu,
	Yonghong Song, Jiri Olsa, Shuah Khan, Hangbin Liu,
	Toke Høiland-Jørgensen, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org
Subject: [PATCH bpf v2] bpf: Run generic devmap egress prog on private skb
Date: Wed, 10 Jun 2026 18:28:49 +0800
Message-ID: <20260610102850.483291-1-sun.jian.kdev@gmail.com>
X-Mailer: git-send-email 2.43.0
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Generic XDP devmap multi redirect uses skb_clone() for the intermediate
destinations and sends the last destination with the original skb. This
can leave multiple destinations sharing the same packet data.

This becomes visible when a devmap egress program mutates packet data.
One destination can observe changes made for another destination. The
last-destination path has the same problem: the last destination runs on
the original skb, so its egress program can modify packet data still
shared with earlier cloned skbs.

Native XDP broadcast redirect does not have this issue because
xdpf_clone() copies the frame data for each destination. Generic XDP
should provide the same per-destination isolation before running a
devmap egress program.
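To make the sharing concrete, here is a small userspace sketch. It is not
kernel code: struct pkt, pkt_clone(), pkt_copy() and egress_set_src_mac()
are hypothetical stand-ins for an skb, skb_clone(), skb_copy()/xdpf_clone()
and a devmap egress program that rewrites the source MAC. The real sk_buff
layout and reference counting are more involved.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PKT_MAX 64

/* Hypothetical model: packet bytes with a reference count. */
struct pkt_data {
	int refs;
	size_t len;
	unsigned char bytes[PKT_MAX];
};

/* Hypothetical model of an skb header pointing at packet bytes. */
struct pkt {
	struct pkt_data *data;
};

static struct pkt *pkt_new(const unsigned char *src, size_t len)
{
	struct pkt_data *d;
	struct pkt *p;

	if (len > PKT_MAX)
		return NULL;
	p = malloc(sizeof(*p));
	d = malloc(sizeof(*d));
	if (!p || !d) {
		free(p);
		free(d);
		return NULL;
	}
	d->refs = 1;
	d->len = len;
	memcpy(d->bytes, src, len);
	p->data = d;
	return p;
}

/* Models skb_clone(): new header, same packet bytes. */
static struct pkt *pkt_clone(struct pkt *p)
{
	struct pkt *c = malloc(sizeof(*c));

	if (!c)
		return NULL;
	c->data = p->data;
	c->data->refs++;
	return c;
}

/* Models skb_copy()/xdpf_clone(): new header and new packet bytes. */
static struct pkt *pkt_copy(const struct pkt *p)
{
	return pkt_new(p->data->bytes, p->data->len);
}

static void pkt_free(struct pkt *p)
{
	if (!p)
		return;
	if (--p->data->refs == 0)
		free(p->data);
	free(p);
}

/* Models an egress program that rewrites the Ethernet source MAC. */
static void egress_set_src_mac(struct pkt *p, const unsigned char mac[6])
{
	if (p->data->len < 12)
		return;
	memcpy(p->data->bytes + 6, mac, 6);
}
```

In this model, a MAC rewrite made through one header is visible through
every header created with pkt_clone(), while a header created with
pkt_copy() keeps its own bytes.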
Fix this by making cloned skbs private in dev_map_generic_redirect()
before running the devmap egress program. Use skb_copy() instead of
skb_unshare() so that allocation failure does not consume the skb and
the existing caller error paths keep their ownership semantics.

Add a selftest that covers the last-destination case where earlier
destinations do not have a devmap egress program, while the final
destination does.

Tested with:
  ./test_progs -t xdp_veth_egress
  ./test_progs -t xdp_veth
  ./test_progs -t xdp

Fixes: e624d4ed4aa8 ("xdp: Extend xdp_redirect_map with broadcast support")
Suggested-by: Jiayuan Chen
Signed-off-by: Sun Jian
---
v1: https://lore.kernel.org/bpf/CABFUUZFimdrZdq=NWi+N-0sJZWvMwY=f4iF6-3TVMS8=m07Zmw@mail.gmail.com/

Changes in v2:
- Move the private-copy step into dev_map_generic_redirect() so the
  last-destination path is covered as well.
- Use skb_copy() instead of skb_unshare() to keep caller ownership
  unchanged on allocation failure.
- Add a generic XDP last-destination selftest case.

 kernel/bpf/devmap.c                           |  10 ++
 .../selftests/bpf/prog_tests/test_xdp_veth.c  | 151 +++++++++++++++++-
 2 files changed, 158 insertions(+), 3 deletions(-)

diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
index cc0a43ebab6b..59f267685bc6 100644
--- a/kernel/bpf/devmap.c
+++ b/kernel/bpf/devmap.c
@@ -700,12 +700,22 @@ int dev_map_enqueue_multi(struct xdp_frame *xdpf, struct net_device *dev_rx,
 int dev_map_generic_redirect(struct bpf_dtab_netdev *dst, struct sk_buff *skb,
 			     const struct bpf_prog *xdp_prog)
 {
+	struct sk_buff *nskb;
 	int err;
 
 	err = xdp_ok_fwd_dev(dst->dev, skb->len);
 	if (unlikely(err))
 		return err;
 
+	if (dst->xdp_prog && skb_cloned(skb)) {
+		nskb = skb_copy(skb, GFP_ATOMIC);
+		if (!nskb)
+			return -ENOMEM;
+
+		consume_skb(skb);
+		skb = nskb;
+	}
+
 	/* Redirect has already succeeded semantically at this point, so we just
 	 * return 0 even if packet is dropped. Helper below takes care of
 	 * freeing skb.
diff --git a/tools/testing/selftests/bpf/prog_tests/test_xdp_veth.c b/tools/testing/selftests/bpf/prog_tests/test_xdp_veth.c
index 3e98a1665936..1f0b9ade12fe 100644
--- a/tools/testing/selftests/bpf/prog_tests/test_xdp_veth.c
+++ b/tools/testing/selftests/bpf/prog_tests/test_xdp_veth.c
@@ -456,7 +456,11 @@ static void xdp_veth_egress(u32 flags)
 			.remote_flags = flags,
 		}
 	};
-	const char magic_mac[6] = { 0xAA, 0xBB, 0xCC, 0xDD, 0xEE, 0xFF};
+	const unsigned char egress_macs[VETH_PAIRS_COUNT][ETH_ALEN] = {
+		{ 0xAA, 0xBB, 0xCC, 0xDD, 0xEE, 0x01 },
+		{ 0xAA, 0xBB, 0xCC, 0xDD, 0xEE, 0x02 },
+		{ 0xAA, 0xBB, 0xCC, 0xDD, 0xEE, 0x03 },
+	};
 	struct xdp_redirect_multi_kern *xdp_redirect_multi_kern;
 	struct bpf_object *bpf_objs[VETH_EGRESS_SKEL_NB];
 	struct xdp_redirect_map *xdp_redirect_map;
@@ -512,7 +516,7 @@ static void xdp_veth_egress(u32 flags)
 						 &net_config, prog_cfg, i))
 			goto destroy_xdp_redirect_map;
 
-		err = bpf_map_update_elem(mac_map, &ifindex, magic_mac, 0);
+		err = bpf_map_update_elem(mac_map, &ifindex, egress_macs[i], 0);
 		if (!ASSERT_OK(err, "bpf_map_update_elem"))
 			goto destroy_xdp_redirect_map;
 
@@ -531,13 +535,16 @@ static void xdp_veth_egress(u32 flags)
 
 	for (i = 0; i < 2; i++) {
 		u32 key = i;
+		__be64 expected = 0;
 		u64 res;
 
 		err = bpf_map_lookup_elem(res_map, &key, &res);
 		if (!ASSERT_OK(err, "get MAC res"))
 			goto destroy_xdp_redirect_map;
 
-		ASSERT_STRNEQ((const char *)&res, magic_mac, ETH_ALEN, "compare mac");
+		/* store_mac_1/2 run on the second/third remote veths. */
+		memcpy(&expected, egress_macs[i + 1], ETH_ALEN);
+		ASSERT_EQ(res, expected, "compare mac");
 	}
 
 destroy_xdp_redirect_map:
@@ -551,6 +558,141 @@ static void xdp_veth_egress(u32 flags)
 	cleanup_network(&net_config);
 }
 
+static void xdp_veth_egress_last_dst(u32 flags)
+{
+	struct prog_configuration prog_cfg[VETH_PAIRS_COUNT] = {
+		{
+			.local_name = "xdp_redirect_map_all_prog",
+			.remote_name = "store_mac_1",
+			.local_flags = flags,
+			.remote_flags = flags,
+		},
+		{
+			.local_name = "xdp_redirect_map_all_prog",
+			.remote_name = "store_mac_2",
+			.local_flags = flags,
+			.remote_flags = flags,
+		},
+		{
+			.local_name = "xdp_redirect_map_all_prog",
+			.remote_name = "xdp_dummy_prog",
+			.local_flags = flags,
+			.remote_flags = flags,
+		}
+	};
+	const unsigned char egress_macs[VETH_PAIRS_COUNT][ETH_ALEN] = {
+		{ 0xAA, 0xBB, 0xCC, 0xDD, 0xEE, 0x01 },
+		{ 0xAA, 0xBB, 0xCC, 0xDD, 0xEE, 0x02 },
+		{ 0xAA, 0xBB, 0xCC, 0xDD, 0xEE, 0x03 },
+	};
+	struct xdp_redirect_multi_kern *xdp_redirect_multi_kern;
+	struct bpf_object *bpf_objs[VETH_EGRESS_SKEL_NB];
+	struct xdp_redirect_map *xdp_redirect_map;
+	struct net_configuration net_config;
+	int mac_map, egress_map, res_map;
+	struct nstoken *nstoken = NULL;
+	struct xdp_dummy *xdp_dummy;
+	__be64 last_mac = 0;
+	bool found = false;
+	int err;
+	int i;
+
+	xdp_dummy = xdp_dummy__open_and_load();
+	if (!ASSERT_OK_PTR(xdp_dummy, "xdp_dummy__open_and_load"))
+		return;
+
+	xdp_redirect_multi_kern = xdp_redirect_multi_kern__open_and_load();
+	if (!ASSERT_OK_PTR(xdp_redirect_multi_kern, "xdp_redirect_multi_kern__open_and_load"))
+		goto destroy_xdp_dummy;
+
+	xdp_redirect_map = xdp_redirect_map__open_and_load();
+	if (!ASSERT_OK_PTR(xdp_redirect_map, "xdp_redirect_map__open_and_load"))
+		goto destroy_xdp_redirect_multi_kern;
+
+	if (!ASSERT_OK(create_network(&net_config), "create network"))
+		goto destroy_xdp_redirect_map;
+
+	mac_map = bpf_map__fd(xdp_redirect_multi_kern->maps.mac_map);
+	if (!ASSERT_OK_FD(mac_map, "open mac_map"))
+		goto destroy_xdp_redirect_map;
+
+	egress_map = bpf_map__fd(xdp_redirect_multi_kern->maps.map_egress);
+	if (!ASSERT_OK_FD(egress_map, "open map_egress"))
+		goto destroy_xdp_redirect_map;
+
+	bpf_objs[0] = xdp_dummy->obj;
+	bpf_objs[1] = xdp_redirect_multi_kern->obj;
+	bpf_objs[2] = xdp_redirect_map->obj;
+
+	nstoken = open_netns(net_config.ns0_name);
+	if (!ASSERT_OK_PTR(nstoken, "open NS0"))
+		goto destroy_xdp_redirect_map;
+
+	for (i = 0; i < VETH_PAIRS_COUNT; i++) {
+		struct bpf_devmap_val devmap_val = {};
+		int ifindex = if_nametoindex(net_config.veth_cfg[i].local_veth);
+
+		SYS(destroy_xdp_redirect_map,
+		    "ip -n %s neigh add %s lladdr 00:00:00:00:00:01 dev %s",
+		    net_config.veth_cfg[i].namespace, IP_NEIGH,
+		    net_config.veth_cfg[i].remote_veth);
+
+		if (attach_programs_to_veth_pair(bpf_objs, VETH_EGRESS_SKEL_NB,
+						 &net_config, prog_cfg, i))
+			goto destroy_xdp_redirect_map;
+
+		err = bpf_map_update_elem(mac_map, &ifindex, egress_macs[i], 0);
+		if (!ASSERT_OK(err, "bpf_map_update_elem"))
+			goto destroy_xdp_redirect_map;
+
+		devmap_val.ifindex = ifindex;
+		devmap_val.bpf_prog.fd = -1;
+
+		if (i == VETH_PAIRS_COUNT - 1)
+			devmap_val.bpf_prog.fd =
+				bpf_program__fd(xdp_redirect_multi_kern->progs.xdp_devmap_prog);
+
+		err = bpf_map_update_elem(egress_map, &ifindex, &devmap_val, 0);
+		if (!ASSERT_OK(err, "bpf_map_update_elem"))
+			goto destroy_xdp_redirect_map;
+	}
+
+	SYS_NOFAIL("ip netns exec %s ping %s -i 0.1 -c 4 -W1 > /dev/null ",
+		   net_config.veth_cfg[0].namespace, IP_NEIGH);
+
+	res_map = bpf_map__fd(xdp_redirect_map->maps.rx_mac);
+	if (!ASSERT_OK_FD(res_map, "open rx_map"))
+		goto destroy_xdp_redirect_map;
+
+	memcpy(&last_mac, egress_macs[VETH_PAIRS_COUNT - 1], ETH_ALEN);
+
+	for (i = 0; i < VETH_PAIRS_COUNT - 1; i++) {
+		u32 key = i;
+		u64 res;
+
+		err = bpf_map_lookup_elem(res_map, &key, &res);
+		if (err == -ENOENT)
+			continue;
+		if (!ASSERT_OK(err, "get MAC res"))
+			goto destroy_xdp_redirect_map;
+
+		found = true;
+		ASSERT_NEQ(res, last_mac, "compare last dst mac");
+	}
+
+	ASSERT_TRUE(found, "found earlier dst mac");
+
+destroy_xdp_redirect_map:
+	close_netns(nstoken);
+	xdp_redirect_map__destroy(xdp_redirect_map);
+destroy_xdp_redirect_multi_kern:
+	xdp_redirect_multi_kern__destroy(xdp_redirect_multi_kern);
+destroy_xdp_dummy:
+	xdp_dummy__destroy(xdp_dummy);
+
+	cleanup_network(&net_config);
+}
+
 void test_xdp_veth_redirect(void)
 {
 	if (test__start_subtest("0"))
@@ -596,4 +738,7 @@ void test_xdp_veth_egress(void)
 
 	if (test__start_subtest("SKB_MODE/egress"))
 		xdp_veth_egress(XDP_FLAGS_SKB_MODE);
+
+	if (test__start_subtest("SKB_MODE/egress_last_dst"))
+		xdp_veth_egress_last_dst(XDP_FLAGS_SKB_MODE);
 }

base-commit: e7ae89a0c97ce2b68b0983cd01eda67cf373517d
-- 
2.43.0
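
As a rough illustration of the ownership point in the changelog (copy
first, and drop the shared reference only once the copy has succeeded),
the following userspace sketch models the same pattern. All names here
(struct buf, buf_make_private(), fail_next_alloc) are hypothetical and
exist only for this example; they are not kernel APIs, and the real code
works on struct sk_buff with skb_copy() and consume_skb().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of a buffer handle whose bytes may be shared. */
struct buf {
	unsigned char *bytes;
	size_t len;
	int *refs;	/* number of handles pointing at bytes */
};

/* Test hook: make the next model_alloc() call fail once. */
static bool fail_next_alloc;

static void *model_alloc(size_t n)
{
	if (fail_next_alloc) {
		fail_next_alloc = false;
		return NULL;
	}
	return malloc(n);
}

static int buf_init(struct buf *b, const unsigned char *src, size_t len)
{
	b->bytes = malloc(len);
	b->refs = malloc(sizeof(*b->refs));
	if (!b->bytes || !b->refs) {
		free(b->bytes);
		free(b->refs);
		return -ENOMEM;
	}
	memcpy(b->bytes, src, len);
	b->len = len;
	*b->refs = 1;
	return 0;
}

/* Second handle on the same bytes, comparable to a clone. */
static void buf_share(struct buf *dst, const struct buf *src)
{
	*dst = *src;
	(*dst->refs)++;
}

static void buf_put(struct buf *b)
{
	if (--(*b->refs) == 0) {
		free(b->bytes);
		free(b->refs);
	}
}

/*
 * Give b its own bytes if they are shared. On allocation failure b is
 * left exactly as it was, so the caller still owns it and its existing
 * error path can release it as before.
 */
static int buf_make_private(struct buf *b)
{
	unsigned char *nbytes;
	int *nrefs;

	if (*b->refs == 1)
		return 0;	/* already private */

	nbytes = model_alloc(b->len);
	if (!nbytes)
		return -ENOMEM;
	nrefs = malloc(sizeof(*nrefs));
	if (!nrefs) {
		free(nbytes);
		return -ENOMEM;
	}

	memcpy(nbytes, b->bytes, b->len);
	*nrefs = 1;
	(*b->refs)--;	/* drop our reference on the shared bytes */
	b->bytes = nbytes;
	b->refs = nrefs;
	return 0;
}
```

In this model a failed buf_make_private() returns -ENOMEM without touching
the handle, which is the property the changelog relies on when it prefers
skb_copy() over skb_unshare(): skb_unshare() frees the skb when its
internal copy fails, so the caller could no longer treat the skb as its
own on the error path.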