* [PATCH bpf-next v2 0/2] bpf: Add ipip6 and ip6ip decap support for bpf_skb_adjust_room()
@ 2023-01-11 8:01 Ziyang Xuan
2023-01-11 8:01 ` [PATCH bpf-next v2 1/2] " Ziyang Xuan
2023-01-11 8:01 ` [PATCH bpf-next v2 2/2] selftests/bpf: add ipip6 and ip6ip decap to test_tc_tunnel Ziyang Xuan
0 siblings, 2 replies; 8+ messages in thread
From: Ziyang Xuan @ 2023-01-11 8:01 UTC (permalink / raw)
To: ast, daniel, andrii, davem, edumazet, kuba, pabeni, willemb, bpf,
netdev, martin.lau, song, yhs, john.fastabend, kpsingh, sdf,
haoluo, jolsa
Add ipip6 and ip6ip decap support for bpf_skb_adjust_room().
The main use case is using cls_bpf on the ingress hook to decapsulate
IPv4 over IPv6 and IPv6 over IPv4 tunnel packets.
Also add ipip6 and ip6ip decap testcases to verify that
bpf_skb_adjust_room() correctly decapsulates ipip6 and ip6ip
tunnel packets.
$./test_tc_tunnel.sh
ipip
encap 192.168.1.1 to 192.168.1.2, type ipip, mac none len 100
test basic connectivity
0
test bpf encap without decap (expect failure)
Ncat: TIMEOUT.
1
test bpf encap with tunnel device decap
0
test bpf encap with bpf decap
0
OK
ipip6
encap 192.168.1.1 to 192.168.1.2, type ipip6, mac none len 100
test basic connectivity
0
test bpf encap without decap (expect failure)
Ncat: TIMEOUT.
1
test bpf encap with tunnel device decap
0
test bpf encap with bpf decap
0
OK
ip6ip6
encap fd::1 to fd::2, type ip6tnl, mac none len 100
test basic connectivity
0
test bpf encap without decap (expect failure)
Ncat: TIMEOUT.
1
test bpf encap with tunnel device decap
0
test bpf encap with bpf decap
0
OK
sit
encap fd::1 to fd::2, type sit, mac none len 100
test basic connectivity
0
test bpf encap without decap (expect failure)
Ncat: TIMEOUT.
1
test bpf encap with tunnel device decap
0
test bpf encap with bpf decap
0
OK
...
OK. All tests passed
v2:
- Use decap flags to indicate the new IP header.
Do not rely on skb->encapsulation.
Ziyang Xuan (2):
bpf: Add ipip6 and ip6ip decap support for bpf_skb_adjust_room()
selftests/bpf: add ipip6 and ip6ip decap to test_tc_tunnel
include/uapi/linux/bpf.h | 8 ++
net/core/filter.c | 26 +++++-
tools/include/uapi/linux/bpf.h | 8 ++
.../selftests/bpf/progs/test_tc_tunnel.c | 91 ++++++++++++++++++-
tools/testing/selftests/bpf/test_tc_tunnel.sh | 15 +--
5 files changed, 139 insertions(+), 9 deletions(-)
--
2.25.1
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH bpf-next v2 1/2] bpf: Add ipip6 and ip6ip decap support for bpf_skb_adjust_room()
2023-01-11 8:01 [PATCH bpf-next v2 0/2] bpf: Add ipip6 and ip6ip decap support for bpf_skb_adjust_room() Ziyang Xuan
@ 2023-01-11 8:01 ` Ziyang Xuan
2023-01-11 15:43 ` Willem de Bruijn
2023-01-12 1:26 ` Martin KaFai Lau
2023-01-11 8:01 ` [PATCH bpf-next v2 2/2] selftests/bpf: add ipip6 and ip6ip decap to test_tc_tunnel Ziyang Xuan
1 sibling, 2 replies; 8+ messages in thread
From: Ziyang Xuan @ 2023-01-11 8:01 UTC (permalink / raw)
To: ast, daniel, andrii, davem, edumazet, kuba, pabeni, willemb, bpf,
netdev, martin.lau, song, yhs, john.fastabend, kpsingh, sdf,
haoluo, jolsa

Add ipip6 and ip6ip decap support for bpf_skb_adjust_room().
The main use case is using cls_bpf on the ingress hook to decapsulate
IPv4 over IPv6 and IPv6 over IPv4 tunnel packets.

Add two new flags BPF_F_ADJ_ROOM_DECAP_L3_IPV{4,6} to indicate the
new IP header version after decapsulating the outer IP header.

Suggested-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
---
 include/uapi/linux/bpf.h       |  8 ++++++++
 net/core/filter.c              | 26 +++++++++++++++++++++++++-
 tools/include/uapi/linux/bpf.h |  8 ++++++++
 3 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 464ca3f01fe7..dde1c2ea1c84 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2644,6 +2644,12 @@ union bpf_attr {
  *		  Use with BPF_F_ADJ_ROOM_ENCAP_L2 flag to further specify the
  *		  L2 type as Ethernet.
  *
+ *		* **BPF_F_ADJ_ROOM_DECAP_L3_IPV4**,
+ *		  **BPF_F_ADJ_ROOM_DECAP_L3_IPV6**:
+ *		  Indicate the new IP header version after decapsulating the
+ *		  outer IP header. Mainly used in scenarios that the inner and
+ *		  outer IP versions are different.
+ *
  *		A call to this helper is susceptible to change the underlying
  *		packet buffer. Therefore, at load time, all checks on pointers
  *		previously done by the verifier are invalidated and must be
@@ -5803,6 +5809,8 @@ enum {
 	BPF_F_ADJ_ROOM_ENCAP_L4_UDP	= (1ULL << 4),
 	BPF_F_ADJ_ROOM_NO_CSUM_RESET	= (1ULL << 5),
 	BPF_F_ADJ_ROOM_ENCAP_L2_ETH	= (1ULL << 6),
+	BPF_F_ADJ_ROOM_DECAP_L3_IPV4	= (1ULL << 7),
+	BPF_F_ADJ_ROOM_DECAP_L3_IPV6	= (1ULL << 8),
 };
 
 enum {
diff --git a/net/core/filter.c b/net/core/filter.c
index 43cc1fe58a2c..5fb113953f80 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3381,13 +3381,17 @@ static u32 bpf_skb_net_base_len(const struct sk_buff *skb)
 #define BPF_F_ADJ_ROOM_ENCAP_L3_MASK	(BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 | \
 					 BPF_F_ADJ_ROOM_ENCAP_L3_IPV6)
 
+#define BPF_F_ADJ_ROOM_DECAP_L3_MASK	(BPF_F_ADJ_ROOM_DECAP_L3_IPV4 | \
+					 BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
+
 #define BPF_F_ADJ_ROOM_MASK		(BPF_F_ADJ_ROOM_FIXED_GSO | \
 					 BPF_F_ADJ_ROOM_ENCAP_L3_MASK | \
 					 BPF_F_ADJ_ROOM_ENCAP_L4_GRE | \
 					 BPF_F_ADJ_ROOM_ENCAP_L4_UDP | \
 					 BPF_F_ADJ_ROOM_ENCAP_L2_ETH | \
 					 BPF_F_ADJ_ROOM_ENCAP_L2( \
-					  BPF_ADJ_ROOM_ENCAP_L2_MASK))
+					  BPF_ADJ_ROOM_ENCAP_L2_MASK) | \
+					 BPF_F_ADJ_ROOM_DECAP_L3_MASK)
 
 static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
 			    u64 flags)
@@ -3501,6 +3505,7 @@ static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff,
 	int ret;
 
 	if (unlikely(flags & ~(BPF_F_ADJ_ROOM_FIXED_GSO |
+			       BPF_F_ADJ_ROOM_DECAP_L3_MASK |
 			       BPF_F_ADJ_ROOM_NO_CSUM_RESET)))
 		return -EINVAL;
 
@@ -3519,6 +3524,14 @@ static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff,
 	if (unlikely(ret < 0))
 		return ret;
 
+	/* Match skb->protocol to new outer l3 protocol */
+	if (skb->protocol == htons(ETH_P_IP) &&
+	    flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
+		skb->protocol = htons(ETH_P_IPV6);
+	else if (skb->protocol == htons(ETH_P_IPV6) &&
+		 flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV4)
+		skb->protocol = htons(ETH_P_IP);
+
 	if (skb_is_gso(skb)) {
 		struct skb_shared_info *shinfo = skb_shinfo(skb);
 
@@ -3596,6 +3609,10 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
 	if (unlikely(proto != htons(ETH_P_IP) &&
 		     proto != htons(ETH_P_IPV6)))
 		return -ENOTSUPP;
+	if (unlikely((shrink && ((flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) ==
+	    BPF_F_ADJ_ROOM_DECAP_L3_MASK)) || (!shrink &&
+	    flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK)))
+		return -EINVAL;
 
 	off = skb_mac_header_len(skb);
 	switch (mode) {
@@ -3608,6 +3625,13 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
 		return -ENOTSUPP;
 	}
 
+	if (shrink) {
+		if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
+			len_min = sizeof(struct ipv6hdr);
+		else if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV4)
+			len_min = sizeof(struct iphdr);
+	}
+
 	len_cur = skb->len - skb_network_offset(skb);
 	if ((shrink && (len_diff_abs >= len_cur ||
 			len_cur - len_diff_abs < len_min)) ||
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 464ca3f01fe7..22672e5c8466 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -2644,6 +2644,12 @@ union bpf_attr {
  *		  Use with BPF_F_ADJ_ROOM_ENCAP_L2 flag to further specify the
  *		  L2 type as Ethernet.
  *
+ *		* **BPF_F_ADJ_ROOM_DECAP_L3_IPV4**,
+ *		  **BPF_F_ADJ_ROOM_DECAP_L3_IPV6**:
+ *		  Indicate the new IP header version after decapsulating the
+ *		  outer IP header. Mainly used in scenarios that the inner and
+ *		  outer IP versions are different.
+ *
  *		A call to this helper is susceptible to change the underlying
  *		packet buffer. Therefore, at load time, all checks on pointers
  *		previously done by the verifier are invalidated and must be
@@ -5803,6 +5809,8 @@ enum {
 	BPF_F_ADJ_ROOM_ENCAP_L4_UDP	= (1ULL << 4),
 	BPF_F_ADJ_ROOM_NO_CSUM_RESET	= (1ULL << 5),
 	BPF_F_ADJ_ROOM_ENCAP_L2_ETH	= (1ULL << 6),
+	BPF_F_ADJ_ROOM_DECAP_L3_IPV4	= (1ULL << 7),
+	BPF_F_ADJ_ROOM_DECAP_L3_IPV6	= (1ULL << 8),
 };
 
 enum {
-- 
2.25.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
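The skb->protocol update that patch 1/2 adds to bpf_skb_net_shrink() can be modelled in plain userspace C. This is only a rough sketch, not kernel code: proto_after_decap() is a hypothetical name, the values are kept in host byte order (the kernel compares against htons() values), and the flag and EtherType constants are redefined locally with the values from the patch and from if_ether.h.

```c
#include <stdint.h>

/* EtherType values, host byte order for this sketch only. */
#define ETH_P_IP	0x0800
#define ETH_P_IPV6	0x86DD

/* Flag values as defined in the patch. */
#define BPF_F_ADJ_ROOM_DECAP_L3_IPV4	(1ULL << 7)
#define BPF_F_ADJ_ROOM_DECAP_L3_IPV6	(1ULL << 8)

/*
 * Hypothetical model: return the L3 protocol the packet carries after
 * the outer IP header has been removed. The protocol only changes when
 * the decap flag names the other IP version; otherwise it is kept.
 */
static uint16_t proto_after_decap(uint16_t proto, uint64_t flags)
{
	if (proto == ETH_P_IP && (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV6))
		return ETH_P_IPV6;
	if (proto == ETH_P_IPV6 && (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV4))
		return ETH_P_IP;
	return proto;
}
```

In this model an IPv6 outer header with the IPV4 flag (the ipip6 case) yields ETH_P_IP, while same-version decap such as ipip leaves the protocol untouched.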
* Re: [PATCH bpf-next v2 1/2] bpf: Add ipip6 and ip6ip decap support for bpf_skb_adjust_room()
2023-01-11 8:01 ` [PATCH bpf-next v2 1/2] " Ziyang Xuan
@ 2023-01-11 15:43 ` Willem de Bruijn
2023-01-12 7:15 ` Ziyang Xuan (William)
2023-01-12 1:26 ` Martin KaFai Lau
1 sibling, 1 reply; 8+ messages in thread
From: Willem de Bruijn @ 2023-01-11 15:43 UTC (permalink / raw)
To: Ziyang Xuan
Cc: ast, daniel, andrii, davem, edumazet, kuba, pabeni, bpf, netdev,
martin.lau, song, yhs, john.fastabend, kpsingh, sdf, haoluo,
jolsa

On Wed, Jan 11, 2023 at 3:01 AM Ziyang Xuan
<william.xuanziyang@huawei.com> wrote:
>
> Add ipip6 and ip6ip decap support for bpf_skb_adjust_room().
> Main use case is for using cls_bpf on ingress hook to decapsulate
> IPv4 over IPv6 and IPv6 over IPv4 tunnel packets.
>
> Add two new flags BPF_F_ADJ_ROOM_DECAP_L3_IPV{4,6} to indicate the
> new IP header version after decapsulating the outer IP header.
>
> Suggested-by: Willem de Bruijn <willemb@google.com>
> Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
> ---
>  include/uapi/linux/bpf.h       |  8 ++++++++
>  net/core/filter.c              | 26 +++++++++++++++++++++++++-
>  tools/include/uapi/linux/bpf.h |  8 ++++++++
>  3 files changed, 41 insertions(+), 1 deletion(-)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 464ca3f01fe7..dde1c2ea1c84 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -2644,6 +2644,12 @@ union bpf_attr {
>   *		  Use with BPF_F_ADJ_ROOM_ENCAP_L2 flag to further specify the
>   *		  L2 type as Ethernet.
>   *
> + *		* **BPF_F_ADJ_ROOM_DECAP_L3_IPV4**,
> + *		  **BPF_F_ADJ_ROOM_DECAP_L3_IPV6**:
> + *		  Indicate the new IP header version after decapsulating the
> + *		  outer IP header. Mainly used in scenarios that the inner and
> + *		  outer IP versions are different.
> + *

Nit (only since I have another comment below)

Indicate -> Set

[Mainly used .. that] -> [Used when]

>  	if (skb_is_gso(skb)) {
>  		struct skb_shared_info *shinfo = skb_shinfo(skb);
>
> @@ -3596,6 +3609,10 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
>  	if (unlikely(proto != htons(ETH_P_IP) &&
>  		     proto != htons(ETH_P_IPV6)))
>  		return -ENOTSUPP;
> +	if (unlikely((shrink && ((flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) ==
> +	    BPF_F_ADJ_ROOM_DECAP_L3_MASK)) || (!shrink &&
> +	    flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK)))
> +		return -EINVAL;
>
>  	off = skb_mac_header_len(skb);
>  	switch (mode) {
> @@ -3608,6 +3625,13 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
>  		return -ENOTSUPP;
>  	}
>
> +	if (shrink) {
> +		if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
> +			len_min = sizeof(struct ipv6hdr);
> +		else if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV4)
> +			len_min = sizeof(struct iphdr);
> +	}
> +

How about combining this branch with the above:

	if (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
		if (!shrink)
			return -EINVAL;

		switch (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
		case BPF_F_ADJ_ROOM_DECAP_L3_IPV4:
			len_min = sizeof(struct iphdr);
			break;
		case BPF_F_ADJ_ROOM_DECAP_L3_IPV6:
			len_min = sizeof(struct ipv6hdr);
			break;
		default:
			return -EINVAL;
		}

^ permalink raw reply [flat|nested] 8+ messages in thread
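The combined check suggested in the review can be exercised outside the kernel. The following is a minimal userspace sketch under stated assumptions: decap_len_min() is a hypothetical helper name, the header sizes 20 and 40 stand in for sizeof(struct iphdr) and sizeof(struct ipv6hdr), and the flag values are copied from the patch. It is not the code that was eventually merged.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values as defined in the patch. */
#define BPF_F_ADJ_ROOM_DECAP_L3_IPV4	(1ULL << 7)
#define BPF_F_ADJ_ROOM_DECAP_L3_IPV6	(1ULL << 8)
#define BPF_F_ADJ_ROOM_DECAP_L3_MASK	(BPF_F_ADJ_ROOM_DECAP_L3_IPV4 | \
					 BPF_F_ADJ_ROOM_DECAP_L3_IPV6)

/* Stand-ins for sizeof(struct iphdr) and sizeof(struct ipv6hdr). */
enum { IPV4_HDR_LEN = 20, IPV6_HDR_LEN = 40 };

/*
 * Hypothetical helper modelling the combined branch: decap flags are
 * only valid when shrinking, exactly one of them may be set, and the
 * chosen flag fixes the minimum length that must remain after decap.
 * *len_min is left untouched when no decap flag is given.
 */
static int decap_len_min(uint64_t flags, bool shrink, uint32_t *len_min)
{
	if (!(flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK))
		return 0;
	if (!shrink)
		return -EINVAL;

	switch (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
	case BPF_F_ADJ_ROOM_DECAP_L3_IPV4:
		*len_min = IPV4_HDR_LEN;
		break;
	case BPF_F_ADJ_ROOM_DECAP_L3_IPV6:
		*len_min = IPV6_HDR_LEN;
		break;
	default:
		/* Both flags set at once. */
		return -EINVAL;
	}
	return 0;
}
```

The switch on the masked value rejects the "both flags set" case through the default label, which is what lets the separate `== BPF_F_ADJ_ROOM_DECAP_L3_MASK` comparison from v2 go away.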
* Re: [PATCH bpf-next v2 1/2] bpf: Add ipip6 and ip6ip decap support for bpf_skb_adjust_room()
2023-01-11 15:43 ` Willem de Bruijn
@ 2023-01-12 7:15 ` Ziyang Xuan (William)
0 siblings, 0 replies; 8+ messages in thread
From: Ziyang Xuan (William) @ 2023-01-12 7:15 UTC (permalink / raw)
To: Willem de Bruijn
Cc: ast, daniel, andrii, davem, edumazet, kuba, pabeni, bpf, netdev,
martin.lau, song, yhs, john.fastabend, kpsingh, sdf, haoluo,
jolsa

> On Wed, Jan 11, 2023 at 3:01 AM Ziyang Xuan
> <william.xuanziyang@huawei.com> wrote:
>>
>> Add ipip6 and ip6ip decap support for bpf_skb_adjust_room().
>> Main use case is for using cls_bpf on ingress hook to decapsulate
>> IPv4 over IPv6 and IPv6 over IPv4 tunnel packets.
>>
>> Add two new flags BPF_F_ADJ_ROOM_DECAP_L3_IPV{4,6} to indicate the
>> new IP header version after decapsulating the outer IP header.
>>
>> Suggested-by: Willem de Bruijn <willemb@google.com>
>> Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
>> ---
>>  include/uapi/linux/bpf.h       |  8 ++++++++
>>  net/core/filter.c              | 26 +++++++++++++++++++++++++-
>>  tools/include/uapi/linux/bpf.h |  8 ++++++++
>>  3 files changed, 41 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> index 464ca3f01fe7..dde1c2ea1c84 100644
>> --- a/include/uapi/linux/bpf.h
>> +++ b/include/uapi/linux/bpf.h
>> @@ -2644,6 +2644,12 @@ union bpf_attr {
>>   *		  Use with BPF_F_ADJ_ROOM_ENCAP_L2 flag to further specify the
>>   *		  L2 type as Ethernet.
>>   *
>> + *		* **BPF_F_ADJ_ROOM_DECAP_L3_IPV4**,
>> + *		  **BPF_F_ADJ_ROOM_DECAP_L3_IPV6**:
>> + *		  Indicate the new IP header version after decapsulating the
>> + *		  outer IP header. Mainly used in scenarios that the inner and
>> + *		  outer IP versions are different.
>> + *
>
> Nit (only since I have another comment below)
>
> Indicate -> Set

Sorry, I think "Indicate" may be more suitable, because the new IP header
is the original inner IP header and is not changed. The flags help the
kernel complete specific tasks. I think "Set" implies a change.

> [Mainly used .. that] -> [Used when]

This looks good to me. Thanks!

>
>>  	if (skb_is_gso(skb)) {
>>  		struct skb_shared_info *shinfo = skb_shinfo(skb);
>>
>> @@ -3596,6 +3609,10 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
>>  	if (unlikely(proto != htons(ETH_P_IP) &&
>>  		     proto != htons(ETH_P_IPV6)))
>>  		return -ENOTSUPP;
>> +	if (unlikely((shrink && ((flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) ==
>> +	    BPF_F_ADJ_ROOM_DECAP_L3_MASK)) || (!shrink &&
>> +	    flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK)))
>> +		return -EINVAL;
>>
>>  	off = skb_mac_header_len(skb);
>>  	switch (mode) {
>> @@ -3608,6 +3625,13 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
>>  		return -ENOTSUPP;
>>  	}
>>
>> +	if (shrink) {
>> +		if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
>> +			len_min = sizeof(struct ipv6hdr);
>> +		else if (flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV4)
>> +			len_min = sizeof(struct iphdr);
>> +	}
>> +
>
> How about combining this branch with the above:
>
>	if (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
>		if (!shrink)
>			return -EINVAL;
>
>		switch (flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK) {
>		case BPF_F_ADJ_ROOM_DECAP_L3_IPV4:
>			len_min = sizeof(struct iphdr);
>			break;
>		case BPF_F_ADJ_ROOM_DECAP_L3_IPV6:
>			len_min = sizeof(struct ipv6hdr);
>			break;
>		default:
>			return -EINVAL;
>		}
>

This looks good to me. Thanks!

>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH bpf-next v2 1/2] bpf: Add ipip6 and ip6ip decap support for bpf_skb_adjust_room()
2023-01-11 8:01 ` [PATCH bpf-next v2 1/2] " Ziyang Xuan
2023-01-11 15:43 ` Willem de Bruijn
@ 2023-01-12 1:26 ` Martin KaFai Lau
1 sibling, 0 replies; 8+ messages in thread
From: Martin KaFai Lau @ 2023-01-12 1:26 UTC (permalink / raw)
To: Ziyang Xuan, ast, daniel, andrii, davem, edumazet, kuba, pabeni,
willemb, bpf, netdev, song, yhs, john.fastabend, kpsingh, sdf,
haoluo, jolsa

On 1/11/23 12:01 AM, Ziyang Xuan wrote:
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 464ca3f01fe7..dde1c2ea1c84 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -2644,6 +2644,12 @@ union bpf_attr {
>   *		  Use with BPF_F_ADJ_ROOM_ENCAP_L2 flag to further specify the
>   *		  L2 type as Ethernet.
>   *
> + *		* **BPF_F_ADJ_ROOM_DECAP_L3_IPV4**,
> + *		  **BPF_F_ADJ_ROOM_DECAP_L3_IPV6**:
> + *		  Indicate the new IP header version after decapsulating the
> + *		  outer IP header. Mainly used in scenarios that the inner and
> + *		  outer IP versions are different.
> + *

selftests/bpf failed to compile. It is probably because there are leading
spaces instead of tabs:

https://github.com/kernel-patches/bpf/actions/runs/3890850490/jobs/6640395038#step:11:112

/tmp/work/bpf/bpf/tools/testing/selftests/bpf/bpf-helpers.rst:1112: (WARNING/2) Bullet list ends without a blank line; unexpected unindent.
make[1]: *** [Makefile.docs:76: /tmp/work/bpf/bpf/tools/testing/selftests/bpf/bpf-helpers.7] Error 12
make: *** [Makefile:259: docs] Error 2

^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH bpf-next v2 2/2] selftests/bpf: add ipip6 and ip6ip decap to test_tc_tunnel
2023-01-11 8:01 [PATCH bpf-next v2 0/2] bpf: Add ipip6 and ip6ip decap support for bpf_skb_adjust_room() Ziyang Xuan
2023-01-11 8:01 ` [PATCH bpf-next v2 1/2] " Ziyang Xuan
@ 2023-01-11 8:01 ` Ziyang Xuan
2023-01-11 15:47 ` Willem de Bruijn
1 sibling, 1 reply; 8+ messages in thread
From: Ziyang Xuan @ 2023-01-11 8:01 UTC (permalink / raw)
To: ast, daniel, andrii, davem, edumazet, kuba, pabeni, willemb, bpf,
netdev, martin.lau, song, yhs, john.fastabend, kpsingh, sdf,
haoluo, jolsa

Add ipip6 and ip6ip decap testcases. Verify that bpf_skb_adjust_room()
correctly decapsulates ipip6 and ip6ip tunnel packets.

Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
---
 .../selftests/bpf/progs/test_tc_tunnel.c      | 91 ++++++++++++++++++-
 tools/testing/selftests/bpf/test_tc_tunnel.sh | 15 +--
 2 files changed, 98 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
index a0e7762b1e5a..e6e678aa9874 100644
--- a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
+++ b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
@@ -38,6 +38,10 @@ static const int cfg_udp_src = 20000;
 #define	VXLAN_FLAGS	0x8
 #define	VXLAN_VNI	1
 
+#ifndef NEXTHDR_DEST
+#define NEXTHDR_DEST	60
+#endif
+
 /* MPLS label 1000 with S bit (last label) set and ttl of 255. */
 static const __u32 mpls_label = __bpf_constant_htonl(1000 << 12 |
 						     MPLS_LS_S_MASK | 0xff);
@@ -363,6 +367,61 @@ static __always_inline int __encap_ipv6(struct __sk_buff *skb, __u8 encap_proto,
 	return TC_ACT_OK;
 }
 
+static int encap_ipv6_ipip6(struct __sk_buff *skb)
+{
+	struct iphdr iph_inner;
+	struct v6hdr h_outer;
+	struct tcphdr tcph;
+	struct ethhdr eth;
+	__u64 flags;
+	int olen;
+
+	if (bpf_skb_load_bytes(skb, ETH_HLEN, &iph_inner,
+			       sizeof(iph_inner)) < 0)
+		return TC_ACT_OK;
+
+	/* filter only packets we want */
+	if (bpf_skb_load_bytes(skb, ETH_HLEN + (iph_inner.ihl << 2),
+			       &tcph, sizeof(tcph)) < 0)
+		return TC_ACT_OK;
+
+	if (tcph.dest != __bpf_constant_htons(cfg_port))
+		return TC_ACT_OK;
+
+	olen = sizeof(h_outer.ip);
+
+	flags = BPF_F_ADJ_ROOM_FIXED_GSO | BPF_F_ADJ_ROOM_ENCAP_L3_IPV6;
+
+	/* add room between mac and network header */
+	if (bpf_skb_adjust_room(skb, olen, BPF_ADJ_ROOM_MAC, flags))
+		return TC_ACT_SHOT;
+
+	/* prepare new outer network header */
+	memset(&h_outer.ip, 0, sizeof(h_outer.ip));
+	h_outer.ip.version = 6;
+	h_outer.ip.hop_limit = iph_inner.ttl;
+	h_outer.ip.saddr.s6_addr[1] = 0xfd;
+	h_outer.ip.saddr.s6_addr[15] = 1;
+	h_outer.ip.daddr.s6_addr[1] = 0xfd;
+	h_outer.ip.daddr.s6_addr[15] = 2;
+	h_outer.ip.payload_len = iph_inner.tot_len;
+	h_outer.ip.nexthdr = IPPROTO_IPIP;
+
+	/* store new outer network header */
+	if (bpf_skb_store_bytes(skb, ETH_HLEN, &h_outer, olen,
+				BPF_F_INVALIDATE_HASH) < 0)
+		return TC_ACT_SHOT;
+
+	/* update eth->h_proto */
+	if (bpf_skb_load_bytes(skb, 0, &eth, sizeof(eth)) < 0)
+		return TC_ACT_SHOT;
+	eth.h_proto = bpf_htons(ETH_P_IPV6);
+	if (bpf_skb_store_bytes(skb, 0, &eth, sizeof(eth), 0) < 0)
+		return TC_ACT_SHOT;
+
+	return TC_ACT_OK;
+}
+
 static __always_inline int encap_ipv6(struct __sk_buff *skb, __u8 encap_proto,
 				      __u16 l2_proto)
 {
@@ -461,6 +520,15 @@ int __encap_ip6tnl_none(struct __sk_buff *skb)
 		return TC_ACT_OK;
 }
 
+SEC("encap_ipip6_none")
+int __encap_ipip6_none(struct __sk_buff *skb)
+{
+	if (skb->protocol == __bpf_constant_htons(ETH_P_IP))
+		return encap_ipv6_ipip6(skb);
+	else
+		return TC_ACT_OK;
+}
+
 SEC("encap_ip6gre_none")
 int __encap_ip6gre_none(struct __sk_buff *skb)
 {
@@ -528,13 +596,33 @@ int __encap_ip6vxlan_eth(struct __sk_buff *skb)
 
 static int decap_internal(struct __sk_buff *skb, int off, int len, char proto)
 {
+	__u64 flags = BPF_F_ADJ_ROOM_FIXED_GSO;
+	struct ipv6_opt_hdr ip6_opt_hdr;
 	struct gre_hdr greh;
 	struct udphdr udph;
 	int olen = len;
 
 	switch (proto) {
 	case IPPROTO_IPIP:
+		flags |= BPF_F_ADJ_ROOM_DECAP_L3_IPV4;
+		break;
 	case IPPROTO_IPV6:
+		flags |= BPF_F_ADJ_ROOM_DECAP_L3_IPV6;
+		break;
+	case NEXTHDR_DEST:
+		if (bpf_skb_load_bytes(skb, off + len, &ip6_opt_hdr,
+				       sizeof(ip6_opt_hdr)) < 0)
+			return TC_ACT_OK;
+		switch (ip6_opt_hdr.nexthdr) {
+		case IPPROTO_IPIP:
+			flags |= BPF_F_ADJ_ROOM_DECAP_L3_IPV4;
+			break;
+		case IPPROTO_IPV6:
+			flags |= BPF_F_ADJ_ROOM_DECAP_L3_IPV6;
+			break;
+		default:
+			return TC_ACT_OK;
+		}
 		break;
 	case IPPROTO_GRE:
 		olen += sizeof(struct gre_hdr);
@@ -569,8 +657,7 @@ static int decap_internal(struct __sk_buff *skb, int off, int len, char proto)
 		return TC_ACT_OK;
 	}
 
-	if (bpf_skb_adjust_room(skb, -olen, BPF_ADJ_ROOM_MAC,
-				BPF_F_ADJ_ROOM_FIXED_GSO))
+	if (bpf_skb_adjust_room(skb, -olen, BPF_ADJ_ROOM_MAC, flags))
 		return TC_ACT_SHOT;
 
 	return TC_ACT_OK;
diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh
index 334bdfeab940..910044f08908 100755
--- a/tools/testing/selftests/bpf/test_tc_tunnel.sh
+++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh
@@ -100,6 +100,9 @@ if [[ "$#" -eq "0" ]]; then
 	echo "ipip"
 	$0 ipv4 ipip none 100
 
+	echo "ipip6"
+	$0 ipv4 ipip6 none 100
+
 	echo "ip6ip6"
 	$0 ipv6 ip6tnl none 100
 
@@ -224,6 +227,9 @@ elif [[ "$tuntype" =~ "gre" && "$mac" == "eth" ]]; then
 elif [[ "$tuntype" =~ "vxlan" && "$mac" == "eth" ]]; then
 	ttype="vxlan"
 	targs="id 1 dstport 8472 udp6zerocsumrx"
+elif [[ "$tuntype" == "ipip6" ]]; then
+	ttype="ip6tnl"
+	targs=""
 else
 	ttype=$tuntype
 	targs=""
@@ -233,6 +239,9 @@ fi
 if [[ "${tuntype}" == "sit" ]]; then
 	link_addr1="${ns1_v4}"
 	link_addr2="${ns2_v4}"
+elif [[ "${tuntype}" == "ipip6" ]]; then
+	link_addr1="${ns1_v6}"
+	link_addr2="${ns2_v6}"
 else
 	link_addr1="${addr1}"
 	link_addr2="${addr2}"
@@ -287,12 +296,6 @@ else
 	server_listen
 fi
 
-# bpf_skb_net_shrink does not take tunnel flags yet, cannot update L3.
-if [[ "${tuntype}" == "sit" ]]; then
-	echo OK
-	exit 0
-fi
-
 # serverside, use BPF for decap
 ip netns exec "${ns2}" ip link del dev testtun0
 ip netns exec "${ns2}" tc qdisc add dev veth2 clsact
-- 
2.25.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
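The flag selection that decap_internal() performs for the plain IP-in-IP cases can be sketched as a small userspace function. This is an illustration only: decap_flag_for_inner() is a hypothetical name, the protocol numbers 4 (IPIP) and 41 (IPv6) are the IANA values behind IPPROTO_IPIP and IPPROTO_IPV6 and are redefined locally under different names, and the GRE, UDP and destination-options paths of the selftest are left out.

```c
#include <stdint.h>

/* IANA protocol numbers behind IPPROTO_IPIP and IPPROTO_IPV6. */
enum { PROTO_IPIP = 4, PROTO_IPV6 = 41 };

/* Flag values as defined in patch 1/2. */
#define BPF_F_ADJ_ROOM_DECAP_L3_IPV4	(1ULL << 7)
#define BPF_F_ADJ_ROOM_DECAP_L3_IPV6	(1ULL << 8)

/*
 * Hypothetical helper: map the outer header's next-protocol field to
 * the decap flag naming the inner IP version. Returns 0 when the
 * payload is not a bare IPv4 or IPv6 packet.
 */
static uint64_t decap_flag_for_inner(uint8_t nexthdr)
{
	switch (nexthdr) {
	case PROTO_IPIP:
		return BPF_F_ADJ_ROOM_DECAP_L3_IPV4;
	case PROTO_IPV6:
		return BPF_F_ADJ_ROOM_DECAP_L3_IPV6;
	default:
		return 0;
	}
}
```

In the selftest the resulting flag is OR-ed with BPF_F_ADJ_ROOM_FIXED_GSO before the call to bpf_skb_adjust_room().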
* Re: [PATCH bpf-next v2 2/2] selftests/bpf: add ipip6 and ip6ip decap to test_tc_tunnel
2023-01-11 8:01 ` [PATCH bpf-next v2 2/2] selftests/bpf: add ipip6 and ip6ip decap to test_tc_tunnel Ziyang Xuan
@ 2023-01-11 15:47 ` Willem de Bruijn
2023-01-12 8:20 ` Ziyang Xuan (William)
0 siblings, 1 reply; 8+ messages in thread
From: Willem de Bruijn @ 2023-01-11 15:47 UTC (permalink / raw)
To: Ziyang Xuan
Cc: ast, daniel, andrii, davem, edumazet, kuba, pabeni, bpf, netdev,
martin.lau, song, yhs, john.fastabend, kpsingh, sdf, haoluo,
jolsa

On Wed, Jan 11, 2023 at 3:02 AM Ziyang Xuan
<william.xuanziyang@huawei.com> wrote:
>
> Add ipip6 and ip6ip decap testcases. Verify that bpf_skb_adjust_room()
> correctly decapsulate ipip6 and ip6ip tunnel packets.
>
> Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
> ---
>  .../selftests/bpf/progs/test_tc_tunnel.c      | 91 ++++++++++++++++++-
>  tools/testing/selftests/bpf/test_tc_tunnel.sh | 15 +--
>  2 files changed, 98 insertions(+), 8 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
> index a0e7762b1e5a..e6e678aa9874 100644
> --- a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
> +++ b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
> @@ -38,6 +38,10 @@ static const int cfg_udp_src = 20000;
>  #define	VXLAN_FLAGS	0x8
>  #define	VXLAN_VNI	1
>
> +#ifndef NEXTHDR_DEST
> +#define NEXTHDR_DEST	60
> +#endif

Should not be needed if including the right header? include/net/ipv6.h

Otherwise very nice extension. Thanks for expanding the test.

^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH bpf-next v2 2/2] selftests/bpf: add ipip6 and ip6ip decap to test_tc_tunnel
2023-01-11 15:47 ` Willem de Bruijn
@ 2023-01-12 8:20 ` Ziyang Xuan (William)
0 siblings, 0 replies; 8+ messages in thread
From: Ziyang Xuan (William) @ 2023-01-12 8:20 UTC (permalink / raw)
To: Willem de Bruijn
Cc: ast, daniel, andrii, davem, edumazet, kuba, pabeni, bpf, netdev,
martin.lau, song, yhs, john.fastabend, kpsingh, sdf, haoluo,
jolsa

> On Wed, Jan 11, 2023 at 3:02 AM Ziyang Xuan
> <william.xuanziyang@huawei.com> wrote:
>>
>> Add ipip6 and ip6ip decap testcases. Verify that bpf_skb_adjust_room()
>> correctly decapsulate ipip6 and ip6ip tunnel packets.
>>
>> Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
>> ---
>>  .../selftests/bpf/progs/test_tc_tunnel.c      | 91 ++++++++++++++++++-
>>  tools/testing/selftests/bpf/test_tc_tunnel.sh | 15 +--
>>  2 files changed, 98 insertions(+), 8 deletions(-)
>>
>> diff --git a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
>> index a0e7762b1e5a..e6e678aa9874 100644
>> --- a/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
>> +++ b/tools/testing/selftests/bpf/progs/test_tc_tunnel.c
>> @@ -38,6 +38,10 @@ static const int cfg_udp_src = 20000;
>>  #define	VXLAN_FLAGS	0x8
>>  #define	VXLAN_VNI	1
>>
>> +#ifndef NEXTHDR_DEST
>> +#define NEXTHDR_DEST	60
>> +#endif
>
> Should not be needed if including the right header? include/net/ipv6.h
>
> Otherwise very nice extension. Thanks for expanding the test.

"net/ipv6.h" is not under /usr/include/ and cannot be included in BPF
programs.

> .
>
^ permalink raw reply [flat|nested] 8+ messages in thread