* [PATCH v2 0/7] seg6: add SRv6 Mobile User Plane (RFC 9433) behaviors
@ 2026-05-04 16:30 Yuya Kusakabe
2026-05-04 16:30 ` [PATCH v2 1/7] seg6: add End.MAP behavior Yuya Kusakabe
` (7 more replies)
0 siblings, 8 replies; 11+ messages in thread
From: Yuya Kusakabe @ 2026-05-04 16:30 UTC
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Andrea Mayer, Shuah Khan, Jonathan Corbet,
Shuah Khan
Cc: linux-kernel, netdev, linux-kselftest, linux-doc, Yuya Kusakabe
This series adds the in-kernel data path for the SRv6 Mobile User
Plane (MUP) architecture defined in RFC 9433. SRv6 MUP integrates
GTP-U mobile traffic into an SRv6 transport domain by mapping the
5-tuple (TEID, QFI, R, U, PDU Session ID) into a single SID, allowing
operators to replace the GTP-U overlay between the gNB and the
upstream UPF with native SRv6 forwarding while keeping the radio side
unchanged.
The series implements the six MUP behaviors that an SRv6 MUP gateway
typically needs:

  End.MAP          (RFC 9433 Section 6.2) -- swap DA with the next SID
                   without consuming the SRH
  End.M.GTP6.D     (Section 6.3) -- IPv6/GTP-U to SRv6 headend encap
  End.M.GTP6.D.Di  (Section 6.4) -- drop-in mode variant of the above
                   (preserves the original outer DA at SRH[0] and
                   discards TEID/QFI)
  End.M.GTP6.E     (Section 6.5) -- SRv6 to IPv6/GTP-U egress encap
  End.M.GTP4.E     (Section 6.6) -- SRv6 to IPv4/GTP-U egress encap
  H.M.GTP4.D       (Section 6.7) -- IPv4/GTP-U to SRv6 headend encap

End.Limit (RFC 9433 Section 6.8) is intentionally out of scope.
All behaviors plug into the existing seg6_local lwtunnel framework, so
they are configurable through the standard "ip route ... encap
seg6local action ..." interface. No new netlink families are
introduced -- the new SEG6_LOCAL_MOBILE_* attributes extend
SEG6_LOCAL_MAX in an add-only way, and the new SEG6_LOCAL_ACTION_*
values are appended.
The egress behaviors (End.M.GTP4.E and End.M.GTP6.E) accept an
optional per-route pdu_type attribute that is the sole control
for inserting the GTP-U PDU Session Container (3GPP TS 38.415 Section
5.5.2). When pdu_type is set (dl/ul/0..15), every emitted GTP-U
packet carries the container with that PDU Type and the QFI extracted
from Args.Mob.Session. When pdu_type is unset, the egress emits
a short GTPv1-U header with no container. pdu_type must be
configured on egress routes serving 5G N3 traffic; omitting it is
intended only for LTE-only / S1-U-style deployments where no PDU
Session Container is exchanged.
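
For readers who want to see the two wire formats side by side, here is
a minimal userspace sketch of the header selection described above. It
is not the kernel code from this series: gtpu_build_header() and its
"pdu_type < 0 means unset" convention are hypothetical names chosen for
illustration. The constants (flags 0x30, E flag 0x04, message type
0xff, extension type 0x85) match those used in patch 2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (illustration only, not part of the series).
 * Writes a GTPv1-U header for a T-PDU of payload_len bytes into buf
 * and returns the header size: 8 bytes when pdu_type is unset (< 0),
 * 16 bytes when a PDU Session Container is appended.
 */
static size_t gtpu_build_header(uint8_t *buf, uint32_t teid, uint8_t qfi,
				int pdu_type, uint16_t payload_len)
{
	size_t hdr_len = (pdu_type < 0) ? 8 : 16;
	/* The Length field counts everything after the mandatory 8 bytes. */
	uint16_t length = (uint16_t)(payload_len + (hdr_len - 8));

	assert(buf && pdu_type >= -1 && pdu_type <= 15);

	buf[0] = 0x30;			/* Version=1, Protocol Type=1 */
	buf[1] = 0xff;			/* G-PDU (T-PDU) message type */
	buf[2] = (uint8_t)(length >> 8);
	buf[3] = (uint8_t)(length & 0xff);
	buf[4] = (uint8_t)(teid >> 24);
	buf[5] = (uint8_t)(teid >> 16);
	buf[6] = (uint8_t)(teid >> 8);
	buf[7] = (uint8_t)teid;
	if (pdu_type < 0)
		return hdr_len;		/* short header, no container */

	buf[0] |= 0x04;			/* E flag: extension header follows */
	buf[8] = 0;			/* sequence number (unused) */
	buf[9] = 0;
	buf[10] = 0;			/* N-PDU number (unused) */
	buf[11] = 0x85;			/* next ext type: PDU Session Container */
	buf[12] = 1;			/* extension length, in 4-byte units */
	buf[13] = (uint8_t)((pdu_type & 0xf) << 4);	/* PDU Type */
	buf[14] = qfi & 0x3f;		/* QFI in the low 6 bits */
	buf[15] = 0;			/* no further extension header */
	return hdr_len;
}
```

With pdu_type unset the QFI argument is ignored, which mirrors the
statement above that pdu_type is the sole control for the container.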
The matching iproute2 patch series has been posted to iproute2-next:
https://lore.kernel.org/netdev/20260505-seg6-mobile-v2-0-93291b7b0134@gmail.com/
Link: https://www.rfc-editor.org/rfc/rfc9433
Signed-off-by: Yuya Kusakabe <yuya.kusakabe@gmail.com>
---
Changes in v2 (all reported by netdev CI, except the End.MAP one
which was caught while reviewing v1):
- patch 1 (End.MAP): drop the explicit hop_limit decrement and
the hop_limit <= 1 ICMPv6 Time Exceeded check; ip6_forward()
on the way out already does both, so the explicit ones caused
a double decrement (verified hlim=64 -> 62 instead of 63).
Now consistent with End / End.X / End.M.GTP*.
- patch 3 (End.M.GTP6.E): add missing #include
<net/ip6_checksum.h> to fix clang / allmodconfig build.
- selftests: silence shellcheck false positives (SC2034/SC2154)
and sort TEST_PROGS entries alphabetically.
- Link to v1: https://lore.kernel.org/netdev/20260504-srv6-mup-v1-v1-0-e0a6791575cb@gmail.com
---
Yuya Kusakabe (7):
seg6: add End.MAP behavior
seg6: add End.M.GTP4.E behavior
seg6: add End.M.GTP6.E behavior
seg6: add End.M.GTP6.D behavior
seg6: add End.M.GTP6.D.Di behavior
seg6: add H.M.GTP4.D behavior
Documentation: networking: add seg6_mobile guide
Documentation/networking/index.rst | 1 +
Documentation/networking/seg6_mobile.rst | 236 ++
include/net/dropreason-core.h | 40 +
include/uapi/linux/seg6_local.h | 17 +
net/ipv6/seg6_local.c | 2660 ++++++++++++++++++--
tools/testing/selftests/net/Makefile | 6 +
.../selftests/net/srv6_end_m_gtp4_e_test.sh | 486 ++++
.../selftests/net/srv6_end_m_gtp6_d_di_test.sh | 427 ++++
.../selftests/net/srv6_end_m_gtp6_d_test.sh | 497 ++++
.../selftests/net/srv6_end_m_gtp6_e_test.sh | 402 +++
tools/testing/selftests/net/srv6_end_map_test.sh | 103 +
.../testing/selftests/net/srv6_h_m_gtp4_d_test.sh | 487 ++++
12 files changed, 5155 insertions(+), 207 deletions(-)
---
base-commit: 98878ed91b68a3150126fccef125ee7b1bb86ab2
change-id: 20260504-seg6-mobile-f78e282e615c
Best regards,
--
Yuya Kusakabe <yuya.kusakabe@gmail.com>
* [PATCH v2 1/7] seg6: add End.MAP behavior
From: Yuya Kusakabe @ 2026-05-04 16:30 UTC
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Andrea Mayer, Shuah Khan, Jonathan Corbet,
Shuah Khan
Cc: linux-kernel, netdev, linux-kselftest, linux-doc, Yuya Kusakabe
Add the End.MAP behavior (RFC 9433 Section 6.2): an endpoint that
replaces the IPv6 destination address with a configured next SID
and forwards via IPv6 routing without consuming the SRH. The new
nh6 attribute selects the replacement SID.
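
As a rough illustration of what "without consuming the SRH" means on
the wire, the following userspace sketch models the rewrite on a raw
packet buffer. It is a simplification under stated assumptions, not the
kernel implementation: end_map_rewrite() is a hypothetical name, and
the Hop Limit is deliberately left alone because the forwarding path
handles it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IPV6_HDR_LEN	40u	/* fixed IPv6 header size */
#define IPV6_DADDR_OFF	24u	/* offset of the destination address */

/* Hypothetical model of End.MAP (illustration only): overwrite the
 * IPv6 destination address with next_sid and touch nothing else.
 * Any SRH that follows the fixed header, including Segments Left,
 * is left exactly as received. Returns 0 on success, -1 when the
 * buffer does not hold an IPv6 header.
 */
static int end_map_rewrite(uint8_t *pkt, size_t len,
			   const uint8_t next_sid[16])
{
	assert(pkt && next_sid);

	if (len < IPV6_HDR_LEN || (pkt[0] >> 4) != 6)
		return -1;

	memcpy(pkt + IPV6_DADDR_OFF, next_sid, 16);
	return 0;
}
```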
Add the two drop reasons that End.MAP emits to dropreason-core.h, so
dropped packets show up in the standard skb:kfree_skb tracepoint:
  SEG6_MOBILE_INVALID_SRH_SL
  SEG6_MOBILE_NOMEM
Configuration:

  ip -6 route add 2001:db8:f::/64 \
      encap seg6local action End.MAP nh6 2001:db8:1::e \
      dev <dev>
Link: https://www.rfc-editor.org/rfc/rfc9433.html#section-6.2
Signed-off-by: Yuya Kusakabe <yuya.kusakabe@gmail.com>
---
include/net/dropreason-core.h | 12 +++
include/uapi/linux/seg6_local.h | 2 +
net/ipv6/seg6_local.c | 73 ++++++++++++++++
tools/testing/selftests/net/Makefile | 1 +
tools/testing/selftests/net/srv6_end_map_test.sh | 103 +++++++++++++++++++++++
5 files changed, 191 insertions(+)
diff --git a/include/net/dropreason-core.h b/include/net/dropreason-core.h
index e0ca3904ff8e..1be5c54d7605 100644
--- a/include/net/dropreason-core.h
+++ b/include/net/dropreason-core.h
@@ -127,6 +127,8 @@
FN(PSP_INPUT) \
FN(PSP_OUTPUT) \
FN(RECURSION_LIMIT) \
+ FN(SEG6_MOBILE_INVALID_SRH_SL) \
+ FN(SEG6_MOBILE_NOMEM) \
FNe(MAX)
/**
@@ -600,6 +602,16 @@ enum skb_drop_reason {
SKB_DROP_REASON_PSP_OUTPUT,
/** @SKB_DROP_REASON_RECURSION_LIMIT: Dead loop on virtual device. */
SKB_DROP_REASON_RECURSION_LIMIT,
+ /**
+ * @SKB_DROP_REASON_SEG6_MOBILE_INVALID_SRH_SL: invalid Segments Left
+ * value or SRH validation failure on an SRv6 Mobile path.
+ */
+ SKB_DROP_REASON_SEG6_MOBILE_INVALID_SRH_SL,
+ /**
+ * @SKB_DROP_REASON_SEG6_MOBILE_NOMEM: skb head/tail expansion or
+ * helper allocation failed on an SRv6 Mobile path.
+ */
+ SKB_DROP_REASON_SEG6_MOBILE_NOMEM,
/**
* @SKB_DROP_REASON_MAX: the maximum of core drop reasons, which
* shouldn't be used as a real 'reason' - only for tracing code gen
diff --git a/include/uapi/linux/seg6_local.h b/include/uapi/linux/seg6_local.h
index 4fdc424c9cb3..45386fdfa821 100644
--- a/include/uapi/linux/seg6_local.h
+++ b/include/uapi/linux/seg6_local.h
@@ -67,6 +67,8 @@ enum {
SEG6_LOCAL_ACTION_END_BPF = 15,
/* decap and lookup of DA in v4 or v6 table */
SEG6_LOCAL_ACTION_END_DT46 = 16,
+ /* swap DA with new SID, leave SRH untouched (RFC 9433 Section 6.2) */
+ SEG6_LOCAL_ACTION_END_MAP = 17,
__SEG6_LOCAL_ACTION_MAX,
};
diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c
index 2b41e4c0dddd..bd8e3312973f 100644
--- a/net/ipv6/seg6_local.c
+++ b/net/ipv6/seg6_local.c
@@ -1468,6 +1468,73 @@ static int input_action_end_bpf(struct sk_buff *skb,
return -EINVAL;
}
+/* SRH validation helper for SRv6 Mobile (RFC 9433) behaviors that may
+ * receive an SRv6 encapsulated packet. Returns the SRH on success or
+ * NULL on validation failure / when the SRH is absent. The caller
+ * uses @missing to distinguish the two NULL cases: an SRH-less packet
+ * may be acceptable depending on the behavior.
+ */
+static struct ipv6_sr_hdr *seg6_mobile_get_validated_srh(struct sk_buff *skb,
+ bool *missing)
+{
+ struct ipv6_sr_hdr *srh = seg6_get_srh(skb, 0);
+
+ if (!srh) {
+ if (missing)
+ *missing = true;
+ return NULL;
+ }
+ if (missing)
+ *missing = false;
+
+#ifdef CONFIG_IPV6_SEG6_HMAC
+ if (!seg6_hmac_validate_skb(skb))
+ return NULL;
+#endif
+ return srh;
+}
+
+/* RFC 9433 Section 6.2 -- End.MAP
+ * Replace the outer IPv6 destination address with the configured next
+ * SID and forward via IPv6 routing; ip6_forward() decrements the Hop
+ * Limit on the way out. The SRH is left untouched, so any subsequent
+ * End* behavior continues to see the original Segment List unchanged.
+ */
+static int input_action_end_map(struct sk_buff *skb,
+ struct seg6_local_lwt *slwt)
+{
+ enum skb_drop_reason reason;
+ struct ipv6_sr_hdr *srh;
+ struct ipv6hdr *ip6h;
+ bool no_srh = false;
+
+ reason = SKB_DROP_REASON_SEG6_MOBILE_INVALID_SRH_SL;
+
+ /* When an SRH is present it must HMAC-validate before we touch
+ * the destination; an SRH-less packet is also accepted because
+ * End.MAP does not consume the SRH.
+ */
+ srh = seg6_mobile_get_validated_srh(skb, &no_srh);
+ if (!srh && !no_srh)
+ goto drop;
+
+ if (skb_ensure_writable(skb, sizeof(*ip6h))) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_NOMEM;
+ goto drop;
+ }
+
+ ip6h = ipv6_hdr(skb);
+ ip6h->daddr = slwt->nh6;
+
+ skb_dst_drop(skb);
+ seg6_lookup_nexthop(skb, NULL, 0);
+ return dst_input(skb);
+
+drop:
+ kfree_skb_reason(skb, reason);
+ return -EINVAL;
+}
+
static struct seg6_action_desc seg6_action_table[] = {
{
.action = SEG6_LOCAL_ACTION_END,
@@ -1565,6 +1632,12 @@ static struct seg6_action_desc seg6_action_table[] = {
.optattrs = SEG6_F_LOCAL_COUNTERS,
.input = input_action_end_bpf,
},
+ {
+ .action = SEG6_LOCAL_ACTION_END_MAP,
+ .attrs = SEG6_F_ATTR(SEG6_LOCAL_NH6),
+ .optattrs = SEG6_F_LOCAL_COUNTERS,
+ .input = input_action_end_map,
+ },
};
diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index a275ed584026..4fbb1eff79f8 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -90,6 +90,7 @@ TEST_PROGS := \
srv6_end_dx4_netfilter_test.sh \
srv6_end_dx6_netfilter_test.sh \
srv6_end_flavors_test.sh \
+ srv6_end_map_test.sh \
srv6_end_next_csid_l3vpn_test.sh \
srv6_end_x_next_csid_l3vpn_test.sh \
srv6_hencap_red_l3vpn_test.sh \
diff --git a/tools/testing/selftests/net/srv6_end_map_test.sh b/tools/testing/selftests/net/srv6_end_map_test.sh
new file mode 100755
index 000000000000..7ee54b4cc97f
--- /dev/null
+++ b/tools/testing/selftests/net/srv6_end_map_test.sh
@@ -0,0 +1,103 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# shellcheck disable=SC2034,SC2154
+#
+# Selftest for the SRv6 End.MAP behavior (RFC 9433 Section 6.2).
+#
+# +--------+ 2001:db8:1::/64 +--------+ 2001:db8:2::/64 +--------+
+# | srupf1 | ------------------- | srupf2 | ------------------- | srupf3 |
+# +--------+ veth-1 +--------+ veth-2 +--------+
+# (intermediate
+# SRv6-aware UPF,
+# End.MAP)
+#
+# All three netns are SRv6-aware UPFs in the RFC 9433 sense (not
+# 3GPP UPFs). Per RFC 9433 Section 6.2 End.MAP is used by the
+# intermediate UPF (here srupf2): srupf2 has an End.MAP SID for
+# locator 2001:db8:f::/64 mapping to the new SID 2001:db8:2::e.
+# srupf1 sends an IPv6 packet to 2001:db8:f::1; on srupf3 the
+# destination address is expected to have been replaced by
+# 2001:db8:2::e.
+
+source lib.sh
+
+readonly TIMEOUT=4
+
+cleanup()
+{
+ cleanup_all_ns
+}
+
+trap cleanup EXIT
+
+setup()
+{
+ setup_ns srupf1 srupf2 srupf3
+
+ ip -n "$srupf1" link set lo up
+ ip -n "$srupf2" link set lo up
+ ip -n "$srupf3" link set lo up
+
+ ip link add veth-1 netns "$srupf1" type veth peer name veth-1-srupf2 \
+ netns "$srupf2"
+ ip -n "$srupf1" addr add 2001:db8:1::1/64 dev veth-1 nodad
+ ip -n "$srupf2" addr add 2001:db8:1::2/64 dev veth-1-srupf2 nodad
+ ip -n "$srupf1" link set veth-1 up
+ ip -n "$srupf2" link set veth-1-srupf2 up
+
+ ip link add veth-2 netns "$srupf2" type veth peer name veth-2-srupf3 \
+ netns "$srupf3"
+ ip -n "$srupf2" addr add 2001:db8:2::1/64 dev veth-2 nodad
+ ip -n "$srupf3" addr add 2001:db8:2::e/64 dev veth-2-srupf3 nodad
+ ip -n "$srupf2" link set veth-2 up
+ ip -n "$srupf3" link set veth-2-srupf3 up
+
+ ip netns exec "$srupf2" sysctl -wq net.ipv6.conf.all.forwarding=1
+
+ ip -n "$srupf1" -6 route add 2001:db8:f::/64 via 2001:db8:1::2
+
+ ip -n "$srupf2" -6 route add 2001:db8:f::/64 \
+ encap seg6local action End.MAP nh6 2001:db8:2::e \
+ dev veth-2
+
+ # allow srupf3 to reply back to srupf1
+ ip -n "$srupf3" -6 route add 2001:db8:1::/64 via 2001:db8:2::1
+}
+
+check_dependencies()
+{
+ if ! command -v ping >/dev/null; then
+ echo "SKIP: ping is required"; exit "$ksft_skip"
+ fi
+
+ if ! ip route help 2>&1 | grep -qF "End.MAP"; then
+ echo "SKIP: iproute2 too old, missing seg6local action End.MAP"
+ exit "$ksft_skip"
+ fi
+}
+
+run_test()
+{
+ # srupf3 replies to ICMPv6 echo on 2001:db8:2::e, so a successful
+ # ping from srupf1 to the End.MAP SID demonstrates that the action
+ # replaced the destination address with 2001:db8:2::e.
+ if ! ip netns exec "$srupf1" ping -6 -c 1 -W "$TIMEOUT" \
+ 2001:db8:f::1 >/dev/null 2>&1; then
+ return 1
+ fi
+ return 0
+}
+
+main()
+{
+ check_dependencies
+ setup
+
+ if run_test; then
+ echo "TEST: End.MAP [PASS]"; exit "$ksft_pass"
+ else
+ echo "TEST: End.MAP [FAIL]"; exit "$ksft_fail"
+ fi
+}
+
+main "$@"
--
2.50.1
* [PATCH v2 2/7] seg6: add End.M.GTP4.E behavior
From: Yuya Kusakabe @ 2026-05-04 16:30 UTC
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Andrea Mayer, Shuah Khan, Jonathan Corbet,
Shuah Khan
Cc: linux-kernel, netdev, linux-kselftest, linux-doc, Yuya Kusakabe
Add the End.M.GTP4.E behavior (RFC 9433 Section 6.6), which
decapsulates an inbound SRv6 packet and re-encapsulates the inner
T-PDU in IPv4/UDP/GTP-U toward a legacy IPv4 receiver, including an
optional PDU Session extension header.
The SID layout per RFC 9433 Section 6.6 Figure 9 is:
|<-- locator -->|<-- IPv4 DA -->|<-- Args.Mob.Session -->|<-pad->|
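
To make the layout concrete, here is a small userspace sketch that
slices a SID according to the figure above. It is an illustration
only: addr_get_bits(), struct gtp4_sid and parse_gtp4_sid() are
hypothetical names, and the result is right-justified for readability
(the kernel helper in this patch left-justifies instead).

```c
#include <assert.h>
#include <stdint.h>

/* Read nbits (1..64) starting at bit_off from a 16-byte big-endian
 * address; the result is right-justified.
 */
static uint64_t addr_get_bits(const uint8_t addr[16], unsigned int bit_off,
			      unsigned int nbits)
{
	uint64_t v = 0;
	unsigned int i;

	assert(nbits >= 1 && nbits <= 64 && bit_off + nbits <= 128);
	for (i = 0; i < nbits; i++) {
		unsigned int bit = bit_off + i;

		v = (v << 1) | ((addr[bit / 8] >> (7 - bit % 8)) & 1u);
	}
	return v;
}

/* Hypothetical result type: host-order IPv4 DA and raw 40-bit args. */
struct gtp4_sid {
	uint32_t v4_da;		/* padded with zero low bits if mask < 32 */
	uint64_t args_mob;	/* Args.Mob.Session, right-justified */
};

/* Split a SID into IPv4 DA and Args.Mob.Session. Returns 0 on success,
 * -1 when the three fields do not fit in 128 bits.
 */
static int parse_gtp4_sid(const uint8_t sid[16], unsigned int locator_bits,
			  unsigned int v4_mask_len, struct gtp4_sid *out)
{
	if (v4_mask_len < 1 || v4_mask_len > 32)
		return -1;
	if (locator_bits + v4_mask_len + 40 > 128)
		return -1;

	out->v4_da = (uint32_t)(addr_get_bits(sid, locator_bits, v4_mask_len)
				<< (32 - v4_mask_len));
	out->args_mob = addr_get_bits(sid, locator_bits + v4_mask_len, 40);
	return 0;
}
```

For example, with a /32 locator and v4_mask_len 32, the SID
2001:db8:c0a8:101:2411:2233:4400:0 carries IPv4 DA 192.168.1.1 followed
by the 40-bit argument 0x2411223344.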
The IPv4 destination is recovered from the SID; the IPv4 source is
recovered from the inbound IPv6 source by overlaying the configured
src_addr template with the v4_mask_len bits at bit offset
v6_src_prefix_len (default 64) of the IPv6 SA, per RFC 9433
Section 6.6 Figure 10. Args.Mob.Session is the 40-bit
field defined in RFC 9433 Section 6.1: QFI(6) | R(1) | U(1) | PDU
Session ID(32).
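
The source-address overlay and the Args.Mob.Session split can be
sketched in userspace as follows. This is an illustration under stated
assumptions rather than the code in this patch: addr_get_u32(),
derive_v4_sa(), struct args_mob and args_mob_decode() are hypothetical
names, addresses are returned in host byte order, and the 40-bit
argument is taken right-justified.

```c
#include <assert.h>
#include <stdint.h>

/* Read the 32-bit window starting at bit_off of a 16-byte address. */
static uint32_t addr_get_u32(const uint8_t addr[16], unsigned int bit_off)
{
	uint32_t v = 0;
	unsigned int i;

	assert(bit_off + 32 <= 128);
	for (i = 0; i < 32; i++) {
		unsigned int bit = bit_off + i;

		v = (v << 1) | ((addr[bit / 8] >> (7 - bit % 8)) & 1u);
	}
	return v;
}

/* The high v4_mask_len bits come from the inbound IPv6 SA, the
 * remaining low bits from the configured template, both read at bit
 * offset prefix_len.
 */
static uint32_t derive_v4_sa(const uint8_t ip6_sa[16],
			     const uint8_t src_template[16],
			     unsigned int v4_mask_len,
			     unsigned int prefix_len)
{
	uint32_t from_sa = addr_get_u32(ip6_sa, prefix_len);
	uint32_t from_tmpl = addr_get_u32(src_template, prefix_len);
	uint32_t mask;

	assert(v4_mask_len >= 1 && v4_mask_len <= 32);
	mask = (v4_mask_len == 32) ? 0xffffffffu
				   : ~(0xffffffffu >> v4_mask_len);
	return (from_sa & mask) | (from_tmpl & ~mask);
}

/* Hypothetical decoded view of the 40-bit Args.Mob.Session. */
struct args_mob {
	uint8_t qfi;			/* 6 bits */
	uint8_t r;			/* 1 bit */
	uint8_t u;			/* 1 bit */
	uint32_t pdu_session_id;	/* 32 bits, used as the TEID */
};

static struct args_mob args_mob_decode(uint64_t args40)
{
	struct args_mob a;

	a.qfi = (uint8_t)((args40 >> 34) & 0x3f);
	a.r = (uint8_t)((args40 >> 33) & 1);
	a.u = (uint8_t)((args40 >> 32) & 1);
	a.pdu_session_id = (uint32_t)args40;
	return a;
}
```

With v4_mask_len 32 the template contributes nothing to the IPv4
source; with a shorter mask its low bits fill the remainder.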
DSCP / ECN / Hop Limit -> TTL are propagated from the inbound IPv6
outer to the new IPv4 outer per RFC 6040. GSO packets that would
not fit the egress route MTU after adding the outer headers are
rejected explicitly because the GSO segmenter cannot fix this up
after the network protocol has changed from IPv6 to IPv4.
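
The arithmetic behind the propagation and the MTU rejection can be
sketched as below. This is a simplified userspace model, not the kernel
path: propagate_outer() and gtp4_e_fits_mtu() are hypothetical names,
and seg_len stands for the network-layer length of one segment of the
inner packet. The worst-case overhead assumes the long GTP-U header
plus the PDU Session Container, as in the finish handler of this patch.

```c
#include <assert.h>
#include <stdint.h>

enum {
	IPV4_HDR_LEN = 20,
	UDP_HDR_LEN = 8,
	GTPU_LONG_HDR_LEN = 12,
	PDU_SESSION_EXT_LEN = 4,
	/* 20 + 8 + 12 + 4 = 44 bytes of new outer headers */
	GTP4_E_WORST_OVERHEAD = IPV4_HDR_LEN + UDP_HDR_LEN +
				GTPU_LONG_HDR_LEN + PDU_SESSION_EXT_LEN
};

/* Hypothetical container for the fields copied to the IPv4 outer. */
struct outer_v4_fields {
	uint8_t tos;	/* DSCP + ECN, taken from the IPv6 Traffic Class */
	uint8_t ttl;	/* taken from the IPv6 Hop Limit */
};

/* ip6h points at the first byte of the inbound IPv6 header. */
static struct outer_v4_fields propagate_outer(const uint8_t *ip6h)
{
	struct outer_v4_fields f;

	assert(ip6h);
	/* Traffic Class straddles the first two bytes of the header. */
	f.tos = (uint8_t)(((ip6h[0] & 0x0f) << 4) | (ip6h[1] >> 4));
	f.ttl = ip6h[7];
	return f;
}

/* Returns 1 when a segment of seg_len bytes still fits mtu after the
 * outer headers are added. An unknown MTU (0) skips the check.
 */
static int gtp4_e_fits_mtu(unsigned int seg_len, unsigned int mtu)
{
	if (!mtu)
		return 1;
	if (mtu <= GTP4_E_WORST_OVERHEAD)
		return 0;
	return seg_len <= mtu - GTP4_E_WORST_OVERHEAD;
}
```

On a 1500-byte MTU this leaves at most 1456 bytes per inner segment.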
When net.netfilter.nf_hooks_lwtunnel=1, the inner T-PDU traverses
NF_INET_PRE_ROUTING between the SRv6 strip and the GTP-U push,
mirroring End.DX4 / End.DX6.
Add four drop reasons used here to dropreason-core.h:
SEG6_MOBILE_BAD_SID, SEG6_MOBILE_BAD_GTPU, SEG6_MOBILE_BAD_INNER,
and SEG6_MOBILE_MTU_EXCEEDED.
Configuration:

  ip -6 route add 2001:db8::/32 \
      encap seg6local action End.M.GTP4.E \
      src 2001:db8:2::1 v4_mask_len 32 \
      dev <dev>
Link: https://www.rfc-editor.org/rfc/rfc9433.html#section-6.6
Link: https://www.rfc-editor.org/rfc/rfc6040
Signed-off-by: Yuya Kusakabe <yuya.kusakabe@gmail.com>
---
include/net/dropreason-core.h | 28 +
include/uapi/linux/seg6_local.h | 6 +
net/ipv6/seg6_local.c | 711 +++++++++++++++++++++
tools/testing/selftests/net/Makefile | 1 +
.../selftests/net/srv6_end_m_gtp4_e_test.sh | 486 ++++++++++++++
5 files changed, 1232 insertions(+)
diff --git a/include/net/dropreason-core.h b/include/net/dropreason-core.h
index 1be5c54d7605..c3f9a7e0727a 100644
--- a/include/net/dropreason-core.h
+++ b/include/net/dropreason-core.h
@@ -129,6 +129,10 @@
FN(RECURSION_LIMIT) \
FN(SEG6_MOBILE_INVALID_SRH_SL) \
FN(SEG6_MOBILE_NOMEM) \
+ FN(SEG6_MOBILE_BAD_SID) \
+ FN(SEG6_MOBILE_BAD_GTPU) \
+ FN(SEG6_MOBILE_BAD_INNER) \
+ FN(SEG6_MOBILE_MTU_EXCEEDED) \
FNe(MAX)
/**
@@ -612,6 +616,30 @@ enum skb_drop_reason {
* helper allocation failed on an SRv6 Mobile path.
*/
SKB_DROP_REASON_SEG6_MOBILE_NOMEM,
+ /**
+ * @SKB_DROP_REASON_SEG6_MOBILE_BAD_SID: SRv6 Mobile (RFC 9433) SID
+ * layout violated (e.g. v4mask out of range, locator + IPv4 DA +
+ * Args.Mob.Session does not fit in the IPv6 destination).
+ */
+ SKB_DROP_REASON_SEG6_MOBILE_BAD_SID,
+ /**
+ * @SKB_DROP_REASON_SEG6_MOBILE_BAD_GTPU: malformed GTP-U header or
+ * GTP-U extension header on an SRv6 Mobile ingress / decap path.
+ */
+ SKB_DROP_REASON_SEG6_MOBILE_BAD_GTPU,
+ /**
+ * @SKB_DROP_REASON_SEG6_MOBILE_BAD_INNER: malformed inner IP packet
+ * on an SRv6 Mobile encap / decap path (failed pskb_may_pull,
+ * ipv6_skip_exthdr, unknown inner IP version, etc.).
+ */
+ SKB_DROP_REASON_SEG6_MOBILE_BAD_INNER,
+ /**
+ * @SKB_DROP_REASON_SEG6_MOBILE_MTU_EXCEEDED: GSO packet would not
+ * fit the egress route MTU after adding the SRv6 Mobile outer
+ * headers, or the post-encap length exceeds MTU on a non-GSO IPv4
+ * input that carries DF.
+ */
+ SKB_DROP_REASON_SEG6_MOBILE_MTU_EXCEEDED,
/**
* @SKB_DROP_REASON_MAX: the maximum of core drop reasons, which
* shouldn't be used as a real 'reason' - only for tracing code gen
diff --git a/include/uapi/linux/seg6_local.h b/include/uapi/linux/seg6_local.h
index 45386fdfa821..b42cb526bb81 100644
--- a/include/uapi/linux/seg6_local.h
+++ b/include/uapi/linux/seg6_local.h
@@ -29,6 +29,10 @@ enum {
SEG6_LOCAL_VRFTABLE,
SEG6_LOCAL_COUNTERS,
SEG6_LOCAL_FLAVORS,
+ SEG6_LOCAL_MOBILE_SRC_ADDR,
+ SEG6_LOCAL_MOBILE_V4_MASK_LEN,
+ SEG6_LOCAL_MOBILE_PDU_TYPE,
+ SEG6_LOCAL_MOBILE_V6_SRC_PREFIX_LEN,
__SEG6_LOCAL_MAX,
};
#define SEG6_LOCAL_MAX (__SEG6_LOCAL_MAX - 1)
@@ -69,6 +73,8 @@ enum {
SEG6_LOCAL_ACTION_END_DT46 = 16,
/* swap DA with new SID, leave SRH untouched (RFC 9433 Section 6.2) */
SEG6_LOCAL_ACTION_END_MAP = 17,
+ /* SRv6 to IPv4/GTP-U encap (RFC 9433 Section 6.6) */
+ SEG6_LOCAL_ACTION_END_M_GTP4_E = 18,
__SEG6_LOCAL_ACTION_MAX,
};
diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c
index bd8e3312973f..4051fe89e6d1 100644
--- a/net/ipv6/seg6_local.c
+++ b/net/ipv6/seg6_local.c
@@ -32,6 +32,10 @@
#include <linux/etherdevice.h>
#include <linux/bpf.h>
#include <linux/netfilter.h>
+#include <linux/udp.h>
+#include <linux/unaligned.h>
+#include <net/gso.h>
+#include <net/gtp.h>
#define SEG6_F_ATTR(i) BIT(i)
@@ -184,6 +188,17 @@ struct seg6_local_counters {
#define SEG6_F_LOCAL_COUNTERS SEG6_F_ATTR(SEG6_LOCAL_COUNTERS)
+/* Per-route configuration for SRv6 Mobile (RFC 9433) behaviors. */
+struct seg6_mobile_info {
+ struct in6_addr src_addr; /* outer IPv6 SA template */
+ u8 v4_mask_len; /* IPv4 portion length (bits) */
+ u8 pdu_type; /* PDU Type (0=downlink, 1=uplink) */
+ bool pdu_type_set; /* PDU Session Container enabled */
+ u8 v6_src_prefix_len; /* Source UPF Prefix length (bits) */
+};
+
+#define SEG6_MOBILE_V6_SRC_PREFIX_LEN_DEFAULT 64
+
struct seg6_local_lwt {
int action;
struct ipv6_sr_hdr *srh;
@@ -197,6 +212,7 @@ struct seg6_local_lwt {
struct seg6_end_dt_info dt_info;
#endif
struct seg6_flavors_info flv_info;
+ struct seg6_mobile_info mobile_info;
struct pcpu_seg6_local_counters __percpu *pcpu_counters;
@@ -1494,6 +1510,493 @@ static struct ipv6_sr_hdr *seg6_mobile_get_validated_srh(struct sk_buff *skb,
return srh;
}
+/* Args.Mob.Session is a 40-bit field (RFC 9433 Section 6.1 Figure 8). */
+#define SEG6_MOBILE_ARGS_MOB_LEN 40
+
+/* Read @nbits from a 16-byte big-endian @addr at bit offset @bit_off,
+ * returned left-justified in 64 bits. Caller ensures bit_off + nbits
+ * <= 128 and 1 <= nbits <= 64.
+ */
+static u64 seg6_mobile_addr_get_bits(const u8 *addr, unsigned int bit_off,
+ unsigned int nbits)
+{
+ u64 hi = get_unaligned_be64(addr);
+ u64 lo = get_unaligned_be64(addr + 8);
+ u64 v;
+
+ if (bit_off == 0)
+ v = hi;
+ else if (bit_off < 64)
+ v = (hi << bit_off) | (lo >> (64 - bit_off));
+ else
+ v = lo << (bit_off - 64);
+
+ return v & GENMASK_ULL(63, 64 - nbits);
+}
+
+static bool seg6_mobile_v4_mask_valid(u8 v4_mask_len)
+{
+ return v4_mask_len > 0 && v4_mask_len <= 32;
+}
+
+/* Extract the IPv4 DA and Args.Mob.Session from an End.M.GTP4.E SID,
+ * where the SR Gateway locator occupies the leading @locator_bits
+ * bits of the IPv6 destination, the IPv4 DA the next @v4_mask_len
+ * bits, and Args.Mob.Session the 40 bits that follow it (RFC 9433
+ * Section 6.6 Figure 9).
+ */
+static bool seg6_mobile_parse_gtp4_sid(const struct in6_addr *daddr,
+ unsigned int locator_bits,
+ u8 v4_mask_len,
+ __be32 *v4_da, u64 *args_mob)
+{
+ u64 da_field;
+
+ if (!seg6_mobile_v4_mask_valid(v4_mask_len))
+ return false;
+ if (locator_bits + v4_mask_len + SEG6_MOBILE_ARGS_MOB_LEN > 128)
+ return false;
+
+ da_field = seg6_mobile_addr_get_bits(daddr->s6_addr, locator_bits,
+ v4_mask_len);
+ *v4_da = htonl((u32)(da_field >> 32));
+
+ *args_mob = seg6_mobile_addr_get_bits(daddr->s6_addr,
+ locator_bits + v4_mask_len,
+ SEG6_MOBILE_ARGS_MOB_LEN);
+ return true;
+}
+
+/* Compose the IPv4 source address per RFC 9433 Section 6.6 Figure 10:
+ * the @v4_mask_len high bits are recovered from the inbound IPv6 SA at
+ * bit offset @v6_src_prefix_len (or /64 when 0); the remaining low bits
+ * come from @src_template at the same offset.
+ */
+static __be32 seg6_mobile_v4_sa(const struct in6_addr *ip6_sa,
+ const struct in6_addr *src_template,
+ u8 v4_mask_len, u8 v6_src_prefix_len)
+{
+ u8 p_bits = v6_src_prefix_len ? : SEG6_MOBILE_V6_SRC_PREFIX_LEN_DEFAULT;
+ u8 sa_bits = min_t(u8, v4_mask_len, 32);
+ u64 template_field, sa_field, mask;
+
+ if ((unsigned int)p_bits + 32 > 128)
+ return 0;
+
+ template_field = seg6_mobile_addr_get_bits(src_template->s6_addr,
+ p_bits, 32);
+
+ if (sa_bits) {
+ sa_field = seg6_mobile_addr_get_bits(ip6_sa->s6_addr,
+ p_bits, sa_bits);
+ mask = (sa_bits >= 64) ? ~0ULL : ((~0ULL) << (64 - sa_bits));
+ template_field = (template_field & ~mask) | (sa_field & mask);
+ }
+
+ return htonl((u32)(template_field >> 32));
+}
+
+/* Return the bit length of the routing prefix that delivered @skb to
+ * the current End.* handler (i.e. the prefix length of the matched FIB
+ * entry). This is the locator length used to position v4DA /
+ * Args.Mob.Session inside the SID per RFC 9433 Section 6.6.
+ */
+static unsigned int seg6_mobile_skb_prefix_bits(const struct sk_buff *skb)
+{
+ struct dst_entry *dst = skb_dst(skb);
+ struct rt6_info *rt;
+ struct fib6_info *fib6;
+ u8 plen = 128;
+
+ /* container_of() below requires an IPv6 dst. */
+ if (!dst || dst->ops->family != AF_INET6)
+ return 128;
+
+ rt = container_of(dst, struct rt6_info, dst);
+ rcu_read_lock();
+ fib6 = rcu_dereference(rt->from);
+ if (fib6)
+ plen = fib6->fib6_dst.plen;
+ rcu_read_unlock();
+
+ return plen;
+}
+
+/* GTP-U PDU Session extension header (3GPP TS 38.415).
+ * 4-byte minimum unit: ext_len=1, PDU Type in high 4 bits of @pdu_type_spare,
+ * QFI in low 6 bits of @spare_qfi, next_ext=0.
+ */
+struct seg6_mobile_pdu_session_ext {
+ __u8 ext_len;
+ __u8 pdu_type_spare;
+ __u8 spare_qfi;
+ __u8 next_ext;
+};
+
+#define SEG6_MOBILE_PDU_SESSION_NH 0x85 /* PDU Session extension header type */
+#define SEG6_MOBILE_PDU_SESSION_QFI_MASK 0x3f
+
+/* GTPv1-U mandatory header flags: Version=1 (bits 7..5 = 001) +
+ * Protocol Type=1 (bit 4); E/S/PN bits clear by default (3GPP TS
+ * 29.060 Figure 2 / Table 5). ORed with GTP1_F_EXTHDR / GTP1_F_SEQ
+ * / GTP1_F_NPDU when those optional fields are present.
+ */
+#define SEG6_MOBILE_GTP1U_FLAGS_BASE 0x30
+
+/* Bit shifts within the left-justified 64-bit Args.Mob.Session
+ * (RFC 9433 Section 6.1 Figure 8): QFI(6) | R(1) | U(1) | TEID(32).
+ */
+#define SEG6_MOBILE_ARGS_QFI_SHIFT 58
+#define SEG6_MOBILE_ARGS_TEID_SHIFT 24
+
+static u8 seg6_mobile_qfi_from_args(u64 args_mob)
+{
+ return (args_mob >> SEG6_MOBILE_ARGS_QFI_SHIFT) &
+ SEG6_MOBILE_PDU_SESSION_QFI_MASK;
+}
+
+static u32 seg6_mobile_teid_from_args(u64 args_mob)
+{
+ return lower_32_bits(args_mob >> SEG6_MOBILE_ARGS_TEID_SHIFT);
+}
+
+/* Push a GTP-U header on top of @skb. When @pdu_type_set is true the
+ * GTPv1 long header (with the EH bit set) is followed by a 4-byte
+ * PDU Session extension header (3GPP TS 38.415); @pdu_type selects
+ * the PDU Type field (0 for downlink, 1 for uplink, 2..15 reserved).
+ * When @pdu_type_set is false the GTPv1 short header is emitted with
+ * no PDU Session Container, regardless of @qfi.
+ */
+static int seg6_mobile_push_gtpu(struct sk_buff *skb, u32 teid, u8 qfi,
+ u8 pdu_type, bool pdu_type_set)
+{
+ struct gtp1_header_long *gtphl;
+ struct gtp1_header *gtph;
+ struct seg6_mobile_pdu_session_ext *pdu_session;
+
+ if (!pdu_type_set) {
+ if (skb_cow_head(skb, sizeof(*gtph)))
+ return -ENOMEM;
+
+ gtph = (struct gtp1_header *)skb_push(skb, sizeof(*gtph));
+ gtph->flags = SEG6_MOBILE_GTP1U_FLAGS_BASE;
+ gtph->type = GTP_TPDU;
+ gtph->length = htons(skb->len - sizeof(*gtph));
+ gtph->tid = htonl(teid);
+ return 0;
+ }
+
+ if (skb_cow_head(skb, sizeof(*gtphl) + sizeof(*pdu_session)))
+ return -ENOMEM;
+
+ pdu_session = skb_push(skb, sizeof(*pdu_session));
+ pdu_session->ext_len = 1;
+ pdu_session->pdu_type_spare = (pdu_type & 0xf) << 4;
+ pdu_session->spare_qfi = qfi & SEG6_MOBILE_PDU_SESSION_QFI_MASK;
+ pdu_session->next_ext = 0;
+
+ gtphl = (struct gtp1_header_long *)skb_push(skb, sizeof(*gtphl));
+ gtphl->flags = SEG6_MOBILE_GTP1U_FLAGS_BASE | GTP1_F_EXTHDR;
+ gtphl->type = GTP_TPDU;
+ gtphl->length = htons(skb->len - sizeof(struct gtp1_header));
+ gtphl->tid = htonl(teid);
+ gtphl->seq = 0;
+ gtphl->npdu = 0;
+ gtphl->next = SEG6_MOBILE_PDU_SESSION_NH;
+
+ return 0;
+}
+
+/* Per-skb context preserved across the NF_INET_PRE_ROUTING hook on
+ * the inner T-PDU exposed by End.M.GTP4.E. After the outer SRv6 has
+ * been popped the inner IP is briefly visible to netfilter; the
+ * finish half then needs the synthesised IPv4 outer fields and the
+ * GTP-U identifiers to rebuild the packet.
+ */
+struct seg6_mobile_gtp4_e_cb {
+ __be32 v4_da;
+ __be32 v4_sa;
+ u32 teid;
+ u8 qfi;
+ u8 outer_tclass;
+ u8 outer_hoplimit;
+ u8 pdu_type;
+ bool pdu_type_set;
+};
+
+#define SEG6_MOBILE_GTP4_E_CB(skb) \
+ ((struct seg6_mobile_gtp4_e_cb *)((skb)->cb))
+
+static int input_action_end_m_gtp4_e_finish(struct net *net,
+ struct sock *sk,
+ struct sk_buff *skb)
+{
+ struct seg6_mobile_gtp4_e_cb cb = *SEG6_MOBILE_GTP4_E_CB(skb);
+ struct dst_entry *orig_dst = skb_dst(skb);
+ enum skb_drop_reason reason = SKB_DROP_REASON_SEG6_MOBILE_NOMEM;
+ struct seg6_local_lwt *slwt;
+ struct iphdr *iph;
+ struct udphdr *uh;
+
+ slwt = seg6_local_lwtunnel(orig_dst->lwtstate);
+
+ /* Reject GSO packets that would not fit the egress IPv4 path after
+ * adding our outer headers; the GSO segmenter cannot fix this up
+ * once we have changed the network protocol from IPv6 to IPv4.
+ * The MTU check uses the inbound IPv6 dst as a conservative bound
+ * (the outbound IPv4 route is not known until ip_route_output_key()
+ * below); skip the check entirely when no MTU is known on the
+ * current dst.
+ */
+ if (skb_is_gso(skb)) {
+ unsigned int ovhd = sizeof(*iph) + sizeof(*uh) +
+ sizeof(struct gtp1_header_long) +
+ sizeof(struct seg6_mobile_pdu_session_ext);
+ unsigned int mtu = dst_mtu(skb_dst(skb));
+
+ if (mtu && (mtu <= ovhd ||
+ !skb_gso_validate_network_len(skb, mtu - ovhd))) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_MTU_EXCEEDED;
+ goto drop;
+ }
+ }
+
+ /* Reserve worst-case headroom for the entire outer chain we are about
+ * to push: IPv4 + UDP + GTP-U long header + PDU Session extension.
+ * Subsequent skb_cow_head() calls inside seg6_mobile_push_gtpu() then
+ * become no-ops.
+ */
+ if (skb_cow_head(skb,
+ sizeof(*iph) + sizeof(*uh) +
+ sizeof(struct gtp1_header_long) +
+ sizeof(struct seg6_mobile_pdu_session_ext)))
+ goto drop;
+
+ if (seg6_mobile_push_gtpu(skb, cb.teid, cb.qfi, cb.pdu_type,
+ cb.pdu_type_set))
+ goto drop;
+
+ uh = skb_push(skb, sizeof(*uh));
+ skb_reset_transport_header(skb);
+ uh->source = htons(GTP1U_PORT);
+ uh->dest = htons(GTP1U_PORT);
+ uh->len = htons(skb->len);
+ uh->check = 0; /* IPv4 UDP checksum optional; offload may set later */
+
+ iph = skb_push(skb, sizeof(*iph));
+ skb_reset_network_header(skb);
+ iph->version = 4;
+ iph->ihl = sizeof(*iph) >> 2;
+ iph->tos = cb.outer_tclass;
+ iph->tot_len = htons(skb->len);
+ iph->frag_off = htons(IP_DF);
+ iph->ttl = cb.outer_hoplimit;
+ iph->protocol = IPPROTO_UDP;
+ iph->saddr = cb.v4_sa;
+ iph->daddr = cb.v4_da;
+ __ip_select_ident(net, iph, 1);
+ iph->check = 0;
+ iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl);
+
+ skb->protocol = htons(ETH_P_IP);
+ nf_reset_ct(skb);
+ skb_dst_drop(skb);
+
+ /* The IPv4 outer is constructed locally from the SRv6 SID and the
+ * inbound IPv6 outer (RFC 9433 Section 6.6 Figure 9 / 10). The
+ * IPv4 source address is synthesised, so it is not guaranteed to
+ * have a reverse route; ip_route_input() + dst_input() with
+ * rp_filter enabled would drop the packet. Use an output route
+ * lookup with FLOWI_FLAG_ANYSRC and emit via dst_output(): the
+ * packet is locally originated from the IPv4 stack's perspective
+ * and traverses NF_INET_LOCAL_OUT.
+ */
+ {
+ struct rtable *rt;
+ struct flowi4 fl4 = {
+ .daddr = cb.v4_da,
+ .saddr = cb.v4_sa,
+ .flowi4_proto = IPPROTO_UDP,
+ .flowi4_flags = FLOWI_FLAG_ANYSRC,
+ .flowi4_oif = slwt->oif,
+ };
+
+ rt = ip_route_output_key(net, &fl4);
+ if (IS_ERR(rt)) {
+ reason = SKB_DROP_REASON_IP_OUTNOROUTES;
+ goto drop;
+ }
+ skb_dst_set(skb, &rt->dst);
+ return dst_output(net, NULL, skb);
+ }
+
+drop:
+ kfree_skb_reason(skb, reason);
+ return -EINVAL;
+}
+
+/* RFC 9433 Section 6.6 -- End.M.GTP4.E
+ * Receives an SRv6 packet and re-encapsulates the inner payload in
+ * IPv4/UDP/GTP-U (with an optional PDU Session extension header)
+ * toward a legacy IPv4 gNB.
+ *
+ * When net.netfilter.nf_hooks_lwtunnel=1 the inner T-PDU is exposed
+ * to NF_INET_PRE_ROUTING after the outer IPv6/SRH is popped and
+ * before the GTP-U header is pushed. This lets nftables / conntrack
+ * apply policy on the inner 5-tuple at the SR Gateway.
+ */
+static int input_action_end_m_gtp4_e(struct sk_buff *skb,
+ struct seg6_local_lwt *slwt)
+{
+ u8 qfi, outer_tclass, outer_hoplimit;
+ unsigned int outer_len;
+ struct ipv6_sr_hdr *srh;
+ struct in6_addr ip6_sa;
+ struct seg6_mobile_gtp4_e_cb *cb;
+ bool no_srh = false;
+ int inner_nfproto;
+ __be16 frag_off;
+ __be32 v4_da, v4_sa;
+ struct ipv6hdr *ip6h;
+ u64 args_mob;
+ u32 teid;
+ u8 nh;
+ int off;
+ const struct seg6_mobile_info *minfo = &slwt->mobile_info;
+ enum skb_drop_reason reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_SID;
+
+ BUILD_BUG_ON(sizeof(struct seg6_mobile_gtp4_e_cb) >
+ sizeof_field(struct sk_buff, cb));
+
+ if (!pskb_may_pull(skb, sizeof(*ip6h))) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_INNER;
+ goto drop;
+ }
+
+ ip6h = ipv6_hdr(skb);
+ ip6_sa = ip6h->saddr;
+
+ /* Snapshot fields read from the IPv6 outer before any pskb_may_pull()
+ * call below: seg6_mobile_get_validated_srh() invokes pskb_may_pull()
+ * internally and may reallocate skb->head, invalidating @ip6h. RFC
+ * 6040 outer-to-outer propagation: DSCP+ECN to TOS, HopLimit to TTL.
+ */
+ outer_tclass = ipv6_get_dsfield(ip6h);
+ outer_hoplimit = ip6h->hop_limit;
+
+ if (!seg6_mobile_parse_gtp4_sid(&ip6h->daddr,
+ seg6_mobile_skb_prefix_bits(skb),
+ minfo->v4_mask_len,
+ &v4_da, &args_mob))
+ goto drop;
+
+ /* Validate SRH (if present) per RFC 9433 Section 6.6 S01-S04: SL
+ * must be 0. HMAC is enforced when the SRH is present and the
+ * kernel was built with CONFIG_IPV6_SEG6_HMAC.
+ */
+ srh = seg6_mobile_get_validated_srh(skb, &no_srh);
+ if (!srh && !no_srh) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_INVALID_SRH_SL;
+ goto drop;
+ }
+ if (srh && srh->segments_left != 0) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_INVALID_SRH_SL;
+ goto drop;
+ }
+
+ /* @ip6h may have been invalidated by pskb_may_pull() inside
+ * seg6_mobile_get_validated_srh(); re-evaluate before any further
+ * dereference.
+ */
+ ip6h = ipv6_hdr(skb);
+
+ teid = seg6_mobile_teid_from_args(args_mob);
+ qfi = seg6_mobile_qfi_from_args(args_mob);
+
+ v4_sa = seg6_mobile_v4_sa(&ip6_sa, &minfo->src_addr, minfo->v4_mask_len,
+ minfo->v6_src_prefix_len);
+
+ /* RFC 9433 Section 6.6 upper-layer S02 mandates "Pop the IPv6
+ * header and all its extension headers". Use ipv6_skip_exthdr()
+ * so HBH / Routing / Dest-Opts / Fragment headers are accounted
+ * for in addition to the SRH. The terminal next-header value
+ * also selects NFPROTO_IPV4 / NFPROTO_IPV6 for the
+ * NF_INET_PRE_ROUTING hook below.
+ */
+ nh = ip6h->nexthdr;
+ off = ipv6_skip_exthdr(skb, sizeof(*ip6h), &nh, &frag_off);
+ if (off < 0) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_INNER;
+ goto drop;
+ }
+ outer_len = off;
+
+ switch (nh) {
+ case IPPROTO_IPIP:
+ inner_nfproto = NFPROTO_IPV4;
+ break;
+ case IPPROTO_IPV6:
+ inner_nfproto = NFPROTO_IPV6;
+ break;
+ default:
+ inner_nfproto = -1;
+ break;
+ }
+
+ /* For inner IP traffic that may traverse NF_INET_PRE_ROUTING below,
+ * pull the full inner IP header into the linear area so a netfilter
+ * hook reading skb_transport_header() does not access stale data.
+ * Non-IP inner is forwarded as-is via the GTP-U T-PDU payload.
+ */
+ if (!pskb_may_pull(skb, outer_len + ((inner_nfproto == NFPROTO_IPV4) ?
+ sizeof(struct iphdr) :
+ (inner_nfproto == NFPROTO_IPV6) ?
+ sizeof(struct ipv6hdr) : 0))) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_INNER;
+ goto drop;
+ }
+
+ skb_pull_rcsum(skb, outer_len);
+ skb_reset_network_header(skb);
+ skb_reset_transport_header(skb);
+
+ cb = SEG6_MOBILE_GTP4_E_CB(skb);
+ cb->v4_da = v4_da;
+ cb->v4_sa = v4_sa;
+ cb->teid = teid;
+ cb->qfi = qfi;
+ cb->outer_tclass = outer_tclass;
+ cb->outer_hoplimit = outer_hoplimit;
+ cb->pdu_type = minfo->pdu_type;
+ cb->pdu_type_set = minfo->pdu_type_set;
+
+ if (inner_nfproto >= 0 &&
+ static_branch_unlikely(&nf_hooks_lwtunnel_enabled)) {
+ /* Set skb->protocol and the transport offset to match the
+ * inner header so the hook chain sees a coherent IPv4 /
+ * IPv6 packet. The finish half overwrites skb->protocol
+ * to ETH_P_IP after the IPv4 outer is pushed.
+ */
+ skb->protocol = (inner_nfproto == NFPROTO_IPV4) ?
+ htons(ETH_P_IP) : htons(ETH_P_IPV6);
+ skb_set_transport_header(skb,
+ (inner_nfproto == NFPROTO_IPV4) ?
+ sizeof(struct iphdr) :
+ sizeof(struct ipv6hdr));
+ nf_reset_ct(skb);
+
+ return NF_HOOK(inner_nfproto, NF_INET_PRE_ROUTING,
+ dev_net(skb->dev), NULL, skb, skb->dev,
+ NULL, input_action_end_m_gtp4_e_finish);
+ }
+
+ return input_action_end_m_gtp4_e_finish(dev_net(skb->dev), NULL, skb);
+
+drop:
+ kfree_skb_reason(skb, reason);
+ return -EINVAL;
+}
+
/* RFC 9433 Section 6.2 -- End.MAP
* Replace the outer IPv6 destination address with the configured next
* SID, decrement the Hop Limit, and forward via IPv6 routing. The
@@ -1535,6 +2038,11 @@ static int input_action_end_map(struct sk_buff *skb,
return -EINVAL;
}
+/* Forward declarations; defined next to the parse_nla_mobile_* helpers below. */
+static int seg6_mobile_v4_validate(struct seg6_local_lwt *slwt,
+ const void *cfg,
+ struct netlink_ext_ack *extack);
+
static struct seg6_action_desc seg6_action_table[] = {
{
.action = SEG6_LOCAL_ACTION_END,
@@ -1632,6 +2140,19 @@ static struct seg6_action_desc seg6_action_table[] = {
.optattrs = SEG6_F_LOCAL_COUNTERS,
.input = input_action_end_bpf,
},
+ {
+ .action = SEG6_LOCAL_ACTION_END_M_GTP4_E,
+ .attrs = SEG6_F_ATTR(SEG6_LOCAL_MOBILE_SRC_ADDR) |
+ SEG6_F_ATTR(SEG6_LOCAL_MOBILE_V4_MASK_LEN),
+ .optattrs = SEG6_F_LOCAL_COUNTERS |
+ SEG6_F_ATTR(SEG6_LOCAL_MOBILE_PDU_TYPE) |
+ SEG6_F_ATTR(SEG6_LOCAL_MOBILE_V6_SRC_PREFIX_LEN) |
+ SEG6_F_ATTR(SEG6_LOCAL_OIF),
+ .input = input_action_end_m_gtp4_e,
+ .slwt_ops = {
+ .build_state = seg6_mobile_v4_validate,
+ },
+ },
{
.action = SEG6_LOCAL_ACTION_END_MAP,
.attrs = SEG6_F_ATTR(SEG6_LOCAL_NH6),
@@ -1728,6 +2249,11 @@ static const struct nla_policy seg6_local_policy[SEG6_LOCAL_MAX + 1] = {
[SEG6_LOCAL_BPF] = { .type = NLA_NESTED },
[SEG6_LOCAL_COUNTERS] = { .type = NLA_NESTED },
[SEG6_LOCAL_FLAVORS] = { .type = NLA_NESTED },
+ [SEG6_LOCAL_MOBILE_SRC_ADDR] =
+ NLA_POLICY_EXACT_LEN(sizeof(struct in6_addr)),
+ [SEG6_LOCAL_MOBILE_V4_MASK_LEN] = { .type = NLA_U8 },
+ [SEG6_LOCAL_MOBILE_PDU_TYPE] = { .type = NLA_U8 },
+ [SEG6_LOCAL_MOBILE_V6_SRC_PREFIX_LEN] = { .type = NLA_U8 },
};
static int parse_nla_srh(struct nlattr **attrs, struct seg6_local_lwt *slwt,
@@ -1962,6 +2488,163 @@ static int cmp_nla_oif(struct seg6_local_lwt *a, struct seg6_local_lwt *b)
return 0;
}
+static int parse_nla_mobile_src_addr(struct nlattr **attrs,
+ struct seg6_local_lwt *slwt,
+ struct netlink_ext_ack *extack)
+{
+ memcpy(&slwt->mobile_info.src_addr,
+ nla_data(attrs[SEG6_LOCAL_MOBILE_SRC_ADDR]),
+ sizeof(struct in6_addr));
+ return 0;
+}
+
+static int put_nla_mobile_src_addr(struct sk_buff *skb,
+ struct seg6_local_lwt *slwt)
+{
+ if (nla_put(skb, SEG6_LOCAL_MOBILE_SRC_ADDR,
+ sizeof(struct in6_addr), &slwt->mobile_info.src_addr))
+ return -EMSGSIZE;
+ return 0;
+}
+
+static int cmp_nla_mobile_src_addr(struct seg6_local_lwt *a,
+ struct seg6_local_lwt *b)
+{
+ return memcmp(&a->mobile_info.src_addr, &b->mobile_info.src_addr,
+ sizeof(struct in6_addr));
+}
+
+static int parse_nla_mobile_v4_mask_len(struct nlattr **attrs,
+ struct seg6_local_lwt *slwt,
+ struct netlink_ext_ack *extack)
+{
+ u8 len = nla_get_u8(attrs[SEG6_LOCAL_MOBILE_V4_MASK_LEN]);
+
+ if (!seg6_mobile_v4_mask_valid(len)) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "SRv6 Mobile IPv4 mask length must be in 1..32 and leave room for the 40-bit Args.Mob.Session");
+ return -EINVAL;
+ }
+ slwt->mobile_info.v4_mask_len = len;
+ return 0;
+}
+
+static int put_nla_mobile_v4_mask_len(struct sk_buff *skb,
+ struct seg6_local_lwt *slwt)
+{
+ if (nla_put_u8(skb, SEG6_LOCAL_MOBILE_V4_MASK_LEN,
+ slwt->mobile_info.v4_mask_len))
+ return -EMSGSIZE;
+ return 0;
+}
+
+static int cmp_nla_mobile_v4_mask_len(struct seg6_local_lwt *a,
+ struct seg6_local_lwt *b)
+{
+ return a->mobile_info.v4_mask_len != b->mobile_info.v4_mask_len;
+}
+
+static int parse_nla_mobile_pdu_type(struct nlattr **attrs,
+ struct seg6_local_lwt *slwt,
+ struct netlink_ext_ack *extack)
+{
+ u8 t = nla_get_u8(attrs[SEG6_LOCAL_MOBILE_PDU_TYPE]);
+
+ /* 3GPP TS 38.415: PDU Type is a 4-bit field. */
+ if (t > 0xf) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "SRv6 Mobile PDU Type must fit in 4 bits (0..15)");
+ return -EINVAL;
+ }
+ slwt->mobile_info.pdu_type = t;
+ slwt->mobile_info.pdu_type_set = true;
+ return 0;
+}
+
+static int put_nla_mobile_pdu_type(struct sk_buff *skb,
+ struct seg6_local_lwt *slwt)
+{
+ if (!slwt->mobile_info.pdu_type_set)
+ return 0;
+ if (nla_put_u8(skb, SEG6_LOCAL_MOBILE_PDU_TYPE,
+ slwt->mobile_info.pdu_type))
+ return -EMSGSIZE;
+ return 0;
+}
+
+static int cmp_nla_mobile_pdu_type(struct seg6_local_lwt *a,
+ struct seg6_local_lwt *b)
+{
+ if (a->mobile_info.pdu_type_set != b->mobile_info.pdu_type_set)
+ return 1;
+ if (!a->mobile_info.pdu_type_set)
+ return 0;
+ return a->mobile_info.pdu_type != b->mobile_info.pdu_type;
+}
+
+static int parse_nla_mobile_v6_src_prefix_len(struct nlattr **attrs,
+ struct seg6_local_lwt *slwt,
+ struct netlink_ext_ack *extack)
+{
+ u8 len = nla_get_u8(attrs[SEG6_LOCAL_MOBILE_V6_SRC_PREFIX_LEN]);
+
+ /* RFC 9433 Section 6.6 Figure 10: P + IPv4 SA (a bits) + padding =
+ * 128. P must be non-zero and leave room for the IPv4 SA (a >= 1)
+ * within the IPv6 source address; the cross-attribute upper bound
+ * (P + a <= 128) is enforced in the action's build_state callback.
+ */
+ if (len == 0 || len > 127) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "SRv6 Mobile v6_src_prefix_len must be in 1..127");
+ return -EINVAL;
+ }
+ slwt->mobile_info.v6_src_prefix_len = len;
+ return 0;
+}
+
+static int put_nla_mobile_v6_src_prefix_len(struct sk_buff *skb,
+ struct seg6_local_lwt *slwt)
+{
+ if (nla_put_u8(skb, SEG6_LOCAL_MOBILE_V6_SRC_PREFIX_LEN,
+ slwt->mobile_info.v6_src_prefix_len))
+ return -EMSGSIZE;
+ return 0;
+}
+
+static int cmp_nla_mobile_v6_src_prefix_len(struct seg6_local_lwt *a,
+ struct seg6_local_lwt *b)
+{
+ return a->mobile_info.v6_src_prefix_len !=
+ b->mobile_info.v6_src_prefix_len;
+}
+
+/* build_state callback shared between End.M.GTP4.E and H.M.GTP4.D
+ * that performs the RFC 9433 Section 6.6 Figure 10 cross-attribute
+ * sanity check (Source UPF Prefix length P + IPv4 portion length a
+ * <= 128) using the effective P (= the configured v6_src_prefix_len
+ * or the SEG6_MOBILE_V6_SRC_PREFIX_LEN_DEFAULT when unset).
+ */
+static int seg6_mobile_v4_validate(struct seg6_local_lwt *slwt,
+ const void *cfg,
+ struct netlink_ext_ack *extack)
+{
+ const struct seg6_mobile_info *minfo = &slwt->mobile_info;
+ u8 p_bits = minfo->v6_src_prefix_len ? :
+ SEG6_MOBILE_V6_SRC_PREFIX_LEN_DEFAULT;
+
+ /* seg6_mobile_v4_sa() reads a 32-bit IPv4 template at @p_bits from
+ * the IPv6 SA template, so the prefix must leave room for those
+ * 32 bits. v4_mask_len is bounded to 32 separately, so this also
+ * implies p_bits + v4_mask_len <= 128.
+ */
+ if ((unsigned int)p_bits + 32 > 128) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "SRv6 Mobile v6_src_prefix_len must leave room for the 32-bit IPv4 source template (prefix_len <= 96)");
+ return -EINVAL;
+ }
+ return 0;
+}
+
#define MAX_PROG_NAME 256
static const struct nla_policy bpf_prog_policy[SEG6_LOCAL_BPF_PROG_MAX + 1] = {
[SEG6_LOCAL_BPF_PROG] = { .type = NLA_U32, },
@@ -2391,6 +3074,22 @@ static struct seg6_action_param seg6_action_params[SEG6_LOCAL_MAX + 1] = {
[SEG6_LOCAL_FLAVORS] = { .parse = parse_nla_flavors,
.put = put_nla_flavors,
.cmp = cmp_nla_flavors },
+
+ [SEG6_LOCAL_MOBILE_SRC_ADDR] = { .parse = parse_nla_mobile_src_addr,
+ .put = put_nla_mobile_src_addr,
+ .cmp = cmp_nla_mobile_src_addr },
+
+ [SEG6_LOCAL_MOBILE_V4_MASK_LEN] = { .parse = parse_nla_mobile_v4_mask_len,
+ .put = put_nla_mobile_v4_mask_len,
+ .cmp = cmp_nla_mobile_v4_mask_len },
+
+ [SEG6_LOCAL_MOBILE_PDU_TYPE] = { .parse = parse_nla_mobile_pdu_type,
+ .put = put_nla_mobile_pdu_type,
+ .cmp = cmp_nla_mobile_pdu_type },
+
+ [SEG6_LOCAL_MOBILE_V6_SRC_PREFIX_LEN] = { .parse = parse_nla_mobile_v6_src_prefix_len,
+ .put = put_nla_mobile_v6_src_prefix_len,
+ .cmp = cmp_nla_mobile_v6_src_prefix_len },
};
/* call the destroy() callback (if available) for each set attribute in
@@ -2707,6 +3406,18 @@ static int seg6_local_get_encap_size(struct lwtunnel_state *lwt)
if (attrs & SEG6_F_ATTR(SEG6_LOCAL_FLAVORS))
nlsize += encap_size_flavors(slwt);
+ if (attrs & SEG6_F_ATTR(SEG6_LOCAL_MOBILE_SRC_ADDR))
+ nlsize += nla_total_size(16);
+
+ if (attrs & SEG6_F_ATTR(SEG6_LOCAL_MOBILE_V4_MASK_LEN))
+ nlsize += nla_total_size(1);
+
+ if (attrs & SEG6_F_ATTR(SEG6_LOCAL_MOBILE_PDU_TYPE))
+ nlsize += nla_total_size(1);
+
+ if (attrs & SEG6_F_ATTR(SEG6_LOCAL_MOBILE_V6_SRC_PREFIX_LEN))
+ nlsize += nla_total_size(1);
+
return nlsize;
}
diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index 4fbb1eff79f8..db1b3ef48f19 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -90,6 +90,7 @@ TEST_PROGS := \
srv6_end_dx4_netfilter_test.sh \
srv6_end_dx6_netfilter_test.sh \
srv6_end_flavors_test.sh \
+ srv6_end_m_gtp4_e_test.sh \
srv6_end_map_test.sh \
srv6_end_next_csid_l3vpn_test.sh \
srv6_end_x_next_csid_l3vpn_test.sh \
diff --git a/tools/testing/selftests/net/srv6_end_m_gtp4_e_test.sh b/tools/testing/selftests/net/srv6_end_m_gtp4_e_test.sh
new file mode 100755
index 000000000000..856be2ea1813
--- /dev/null
+++ b/tools/testing/selftests/net/srv6_end_m_gtp4_e_test.sh
@@ -0,0 +1,486 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# shellcheck disable=SC2034,SC2154
+#
+# Selftest for the SRv6 End.M.GTP4.E behavior (RFC 9433 Section 6.6).
+#
+# Three network namespaces are connected back-to-back:
+#
+# +-------+ 2001:db8:1::/64 +-------+ 10.0.0.0/24 +-------+
+# | srupf | ------------------- | srgw | ------------------- | gnb |
+# +-------+ veth-n9 +-------+ veth-n3 +-------+
+#
+# srupf is the SR-domain-side SRv6-aware UPF (RFC 9433 sense, not a
+# 3GPP UPF) that injects the SRv6 packets, gnb is the GTP-U-side
+# test peer, and srgw runs the End.M.GTP4.E behavior under test.
+#
+# On srgw an End.M.GTP4.E SID is installed with a /32 routing prefix;
+# the SID layout (per RFC 9433 Section 6.6 Figure 9) is:
+#
+# Locator | IPv4 DA (v4_mask_len bits) | Args.Mob.Session (40 bits) [| pad]
+#
+# With locator=/32 and v4_mask_len=32 the IPv4 DA lives at bytes 4..7 and
+# Args.Mob.Session at bytes 8..12; bytes 13..15 are SID padding.
+# Choosing a non-tail-aligned layout (i.e. not /56 with c=0) makes sure
+# the test exercises the offset-based extraction rather than a
+# "last 5 bytes" shortcut.
+#
+# Args.Mob.Session is laid out as (RFC 9433 Section 6.1, Figure 8 -- 40 bits):
+# QFI (6) | R (1) | U (1) | PDU Session ID (32)
+#
+# The test crafts an IPv6 packet whose destination address encodes
+#
+# IPv4 DA = 10.0.0.2 (gnb)
+# QFI = 5
+# PDU Session ID = 0x123 (= the GTP-U TEID, 32 bits)
+#
+# Args.Mob.Session bytes are therefore 14 00 00 01 23 (top byte is the
+# QFI byte (5 << 2) = 0x14, next four bytes are the 32-bit TEID). With
+# the /32-locator placement the SID ends up as
+# 2001:db8:a00:2:1400:1:2300:0 .
+# The expected output is an IPv4/UDP/GTP-U(long)/PDU-Session-ext packet with
+# TEID 0x00000123 and QFI 5.
+#
+# The IPv6 source address layout per RFC 9433 Section 6.6 Figure 10:
+#
+# | Source UPF Prefix (P bits) | IPv4 SA (a bits) | padding |
+#
+# is exercised in two scenarios:
+# - Default (no v6_src_prefix_len attribute): P = 64, IPv4 SA at
+# IPv6 bytes 8..11.
+# - Explicit v6_src_prefix_len 48: IPv4 SA at IPv6 bytes 6..9, with
+# a 6-byte Source UPF Prefix and a 6-byte trailing padding region.
+
+source lib.sh
+
+readonly TIMEOUT=4
+tcpdump_pid=""
+have_vrf=0
+
+cleanup()
+{
+ if [ -n "$tcpdump_pid" ]; then
+ kill "$tcpdump_pid" 2>/dev/null
+ wait "$tcpdump_pid" 2>/dev/null
+ fi
+ cleanup_all_ns
+}
+
+trap cleanup EXIT
+
+setup()
+{
+ setup_ns srupf srgw gnb gnb_vrf
+
+ ip -n "$srgw" link set dev lo up
+ ip -n "$srupf" link set dev lo up
+ ip -n "$gnb" link set dev lo up
+ ip -n "$gnb_vrf" link set dev lo up
+
+ # srupf <-> srgw (IPv6). Two srupf addresses encode the same
+ # IPv4 SA (10.0.0.1) at different byte offsets, exercising the
+ # default /64 and an explicit /48 Source UPF Prefix layout:
+ # 2001:db8:1::a00:1:0:1 -> IPv4 SA at IPv6 bytes 8..11 (P = 64)
+ # 2001:db8:3:a00:1::1 -> IPv4 SA at IPv6 bytes 6..9 (P = 48)
+ # The srgw peer addresses are placed on the same IPv6 /64 prefix
+ # as the srupf side so the srupf routes can use them as on-link
+ # next-hops without additional routes or static neighbor entries.
+ ip link add veth-n9 netns "$srupf" type veth peer name veth-n9-srgw \
+ netns "$srgw"
+ ip -n "$srupf" addr add 2001:db8:1::a00:1:0:1/64 dev veth-n9 nodad
+ ip -n "$srupf" addr add 2001:db8:3:a00:1::1/64 dev veth-n9 nodad
+ ip -n "$srgw" addr add 2001:db8:1::2/64 dev veth-n9-srgw nodad
+ ip -n "$srgw" addr add 2001:db8:3:a00:1::2/64 dev veth-n9-srgw nodad
+ ip -n "$srupf" link set dev veth-n9 up
+ ip -n "$srgw" link set dev veth-n9-srgw up
+
+ # srgw <-> gnb (IPv4)
+ ip link add veth-n3 netns "$srgw" type veth peer name veth-n3-gnb \
+ netns "$gnb"
+ ip -n "$srgw" addr add 10.0.0.1/24 dev veth-n3
+ ip -n "$gnb" addr add 10.0.0.2/24 dev veth-n3-gnb
+ ip -n "$srgw" link set dev veth-n3 up
+ ip -n "$gnb" link set dev veth-n3-gnb up
+
+ # allow forwarding on srgw
+ ip netns exec "$srgw" sysctl -wq net.ipv4.ip_forward=1
+ ip netns exec "$srgw" sysctl -wq net.ipv6.conf.all.forwarding=1
+
+ # routes on srupf toward the End.M.GTP4.E locators
+ ip -n "$srupf" -6 route add 2001:db8::/32 via 2001:db8:1::2
+ ip -n "$srupf" -6 route add 2001:db9::/32 via 2001:db8:3:a00:1::2
+ ip -n "$srupf" -6 route add 2001:dbb::/32 via 2001:db8:1::2
+
+ # install End.M.GTP4.E on srgw with PDU Session Container (5G N3:
+ # pdu_type dl), default /64 Source UPF Prefix
+ ip -n "$srgw" -6 route add 2001:db8::/32 \
+ encap seg6local action End.M.GTP4.E \
+ src 2001:db8:1::2 v4_mask_len 32 pdu_type dl \
+ dev veth-n3
+
+ # install End.M.GTP4.E on srgw with PDU Session Container and
+ # explicit v6_src_prefix_len 48
+ ip -n "$srgw" -6 route add 2001:db9::/32 \
+ encap seg6local action End.M.GTP4.E \
+ src 2001:db9::1 v4_mask_len 32 v6_src_prefix_len 48 \
+ pdu_type dl \
+ dev veth-n3
+
+ # install End.M.GTP4.E on srgw WITHOUT pdu_type: short GTPv1-U
+ # (LTE-style, no PDU Session Container) regardless of QFI
+ ip -n "$srgw" -6 route add 2001:dbb::/32 \
+ encap seg6local action End.M.GTP4.E \
+ src 2001:db8:1::2 v4_mask_len 32 \
+ dev veth-n3
+
+ # Per-route VRF case: a second egress IPv4 path in its own VRF
+ # (e.g. modelling a second tenant on a different interface). The
+ # End.M.GTP4.E SID for this tenant binds the egress IPv4 lookup to
+ # the VRF via the standard seg6_local 'oif' attribute; without
+ # it, the lookup would fall through to the main table where the
+ # 10.0.1.0/24 prefix does not exist.
+ # Reported as [SKIP] when VRF support (CONFIG_NET_VRF) is unavailable.
+ modprobe vrf 2>/dev/null
+ if ip -n "$srgw" link add vrf-n3 type vrf table 100 2>/dev/null; then
+ have_vrf=1
+ ip -n "$srgw" link set dev vrf-n3 up
+
+ ip link add veth-n3-2 netns "$srgw" type veth peer name \
+ veth-n3-2-gnb netns "$gnb_vrf"
+ ip -n "$srgw" link set dev veth-n3-2 master vrf-n3
+ ip -n "$srgw" addr add 10.0.1.1/24 dev veth-n3-2
+ ip -n "$gnb_vrf" addr add 10.0.1.2/24 dev veth-n3-2-gnb
+ ip -n "$srgw" link set dev veth-n3-2 up
+ ip -n "$gnb_vrf" link set dev veth-n3-2-gnb up
+
+ ip -n "$srupf" -6 route add 2001:dba::/32 via 2001:db8:1::2
+
+ ip -n "$srgw" -6 route add 2001:dba::/32 \
+ encap seg6local action End.M.GTP4.E \
+ src 2001:db8:1::2 v4_mask_len 32 oif vrf-n3 \
+ pdu_type dl \
+ dev veth-n3-2
+ fi
+}
+
+check_dependencies()
+{
+ if ! ip netns help 2>&1 | grep -q "exec"; then
+ echo "SKIP: ip netns exec not available"
+ exit "$ksft_skip"
+ fi
+
+ if ! command -v tcpdump >/dev/null; then
+ echo "SKIP: tcpdump is required"
+ exit "$ksft_skip"
+ fi
+
+ if ! command -v ping >/dev/null; then
+ echo "SKIP: ping is required"
+ exit "$ksft_skip"
+ fi
+
+ if ! command -v python3 >/dev/null; then
+ echo "SKIP: python3 is required"
+ exit "$ksft_skip"
+ fi
+
+ if ! ip route help 2>&1 | grep -qF "End.M.GTP4.E"; then
+ echo "SKIP: iproute2 too old, missing seg6local action End.M.GTP4.E"
+ exit "$ksft_skip"
+ fi
+
+ if ! python3 -c "import scapy.all" 2>/dev/null; then
+ echo "SKIP: python3-scapy is required"
+ exit "$ksft_skip"
+ fi
+}
+
+capture_traffic()
+{
+ local capture_ns="$1"
+ local capture_iface="$2"
+ local src="$3"
+ local sid="$4"
+ local out="$5"
+
+ # capture GTP-U traffic on the egress side. The capture is torn down
+ # by the explicit kill -INT below; the cleanup() trap only fires for
+ # unexpected exits.
+ ip netns exec "$capture_ns" tcpdump -U -nni "$capture_iface" -w "$out" \
+ 'udp port 2152' 2>/dev/null &
+ tcpdump_pid=$!
+ # Give tcpdump a brief moment to attach the BPF filter before we
+ # start sending traffic; tcpdump does not expose a "ready" signal.
+ sleep 1
+
+ # Send a single ICMPv6 echo-request to the End.M.GTP4.E SID.
+ ip netns exec "$srupf" ping -6 -c 1 -W "$TIMEOUT" -I "$src" "$sid" \
+ >/dev/null 2>&1
+
+ # stop tcpdump after the packet has had time to traverse
+ sleep 1
+ kill -INT "$tcpdump_pid" 2>/dev/null
+ wait "$tcpdump_pid" 2>/dev/null
+ tcpdump_pid=""
+}
+
+run_test()
+{
+ local src="$1" # IPv6 SA the srupf must use
+ local sid="$2" # End.M.GTP4.E SID to ping
+ local expected_v4_src="$3" # expected IPv4 SA in the egress GTP-U
+ local capture_ns="${4:-$gnb}" # netns where GTP-U is expected to land
+ local capture_iface="${5:-veth-n3-gnb}"
+ local out
+ local rc
+
+ out=$(mktemp)
+ capture_traffic "$capture_ns" "$capture_iface" "$src" "$sid" "$out"
+
+ # Expected wire layout (verified via scapy field comparison rather
+ # than tcpdump -X | grep so the test is robust against tcpdump
+ # output formatting changes):
+ # IPv4 (src=$expected_v4_src, dst=IPv4 DA from the SID) | UDP(2152) |
+ # GTPv1 long (TEID=0x123, S/PN/E=001) |
+ # PDU Session ext (next=0x85, len=1, PDU type=DL=0, QFI=5) | inner T-PDU
+ EXPECTED_V4_SRC="$expected_v4_src" python3 - "$out" <<'PYEOF'
+import os, sys
+from scapy.all import rdpcap, IP, UDP
+
+expected_v4_src = os.environ['EXPECTED_V4_SRC']
+pkts = rdpcap(sys.argv[1])
+if not pkts:
+    sys.exit("no captured packets")
+
+found = False
+for p in pkts:
+    if not (IP in p and UDP in p):
+        continue
+    if p[UDP].dport != 2152:
+        continue
+    if p[IP].src != expected_v4_src:
+        sys.exit(f"unexpected IPv4 SA {p[IP].src}, want {expected_v4_src}")
+    payload = bytes(p[UDP].payload)
+    # GTP-U long header: flags(1)|mtype(1)|len(2)|teid(4)|seq(2)|npdu(1)|next(1),
+    # followed by the 4-byte PDU Session extension header read below.
+    if len(payload) < 16:
+        continue
+    teid = int.from_bytes(payload[4:8], 'big')
+    if teid != 0x00000123:
+        sys.exit(f"unexpected TEID 0x{teid:08x}, want 0x00000123")
+    next_ext = payload[11]
+    if next_ext != 0x85:
+        sys.exit(f"missing PDU Session ext (next={next_ext:#04x}, want 0x85)")
+    pdu_session = payload[12:16]
+    if pdu_session[0] != 0x01:
+        sys.exit(f"PDU Session ext_len {pdu_session[0]} != 1")
+    pdu_type = pdu_session[1] >> 4
+    qfi = pdu_session[2] & 0x3f
+    if pdu_type != 0:
+        sys.exit(f"PDU Type {pdu_type} != 0 (DL)")
+    if qfi != 5:
+        sys.exit(f"PDU Session QFI {qfi} != 5")
+    found = True
+    break
+
+if not found:
+    sys.exit("no IPv4/UDP/GTP-U packet observed")
+PYEOF
+ rc=$?
+ rm -f "$out"
+ return $rc
+}
+
+# Verify the short-GTPv1-U output produced when pdu_type is unset on the
+# route: 8-byte GTP-U header, no extension flag, no PDU Session
+# Container, regardless of the QFI extracted from Args.Mob.Session.
+run_test_short()
+{
+ local src="$1"
+ local sid="$2"
+ local expected_v4_src="$3"
+ local out
+ local rc
+
+ out=$(mktemp)
+ capture_traffic "$gnb" "veth-n3-gnb" "$src" "$sid" "$out"
+
+ EXPECTED_V4_SRC="$expected_v4_src" python3 - "$out" <<'PYEOF'
+import os, sys
+from scapy.all import rdpcap, IP, UDP
+
+expected_v4_src = os.environ['EXPECTED_V4_SRC']
+pkts = rdpcap(sys.argv[1])
+if not pkts:
+    sys.exit("no captured packets")
+
+found = False
+for p in pkts:
+    if not (IP in p and UDP in p):
+        continue
+    if p[UDP].dport != 2152:
+        continue
+    if p[IP].src != expected_v4_src:
+        sys.exit(f"unexpected IPv4 SA {p[IP].src}, want {expected_v4_src}")
+    payload = bytes(p[UDP].payload)
+    if len(payload) < 8:
+        continue
+    flags = payload[0]
+    # Short GTPv1-U: version=1, PT=1, no E/S/PN bits (0x30).
+    if flags != 0x30:
+        sys.exit(f"unexpected GTP-U flags {flags:#04x}, want 0x30 (short)")
+    teid = int.from_bytes(payload[4:8], 'big')
+    if teid != 0x00000123:
+        sys.exit(f"unexpected TEID 0x{teid:08x}, want 0x00000123")
+    found = True
+    break
+
+if not found:
+    sys.exit("no IPv4/UDP/GTP-U packet observed")
+PYEOF
+ rc=$?
+ rm -f "$out"
+ return $rc
+}
+
+# Verify that nf_hooks_lwtunnel=1 makes the inner T-PDU 5-tuple
+# visible to nftables on the SR Gateway. The inner T-PDU is IPv6
+# (ICMPv6 echo-request from the srupf); the nft rule matches on its
+# IPv6 source address. DROP must suppress the GTP-U at the gnb,
+# ACCEPT must let it through.
+run_nf_test()
+{
+ local verdict="$1" # drop | accept
+ local expect="$2" # 1 if GTP-U expected, empty otherwise
+ local src="2001:db8:1::a00:1:0:1"
+ local sid="2001:db8:a00:2:1400:1:2300:0"
+ local out
+
+ ip netns exec "$srgw" nft flush chain ip6 filter prerouting
+ ip netns exec "$srgw" nft add rule ip6 filter prerouting \
+ ip6 saddr "$src" "$verdict"
+
+ out=$(mktemp)
+ capture_traffic "$gnb" "veth-n3-gnb" "$src" "$sid" "$out"
+
+ if [ -n "$expect" ]; then
+ python3 - "$out" <<'PYEOF'
+import sys
+from scapy.all import rdpcap, IP, UDP
+
+pkts = rdpcap(sys.argv[1])
+for p in pkts:
+    if IP in p and UDP in p and p[UDP].dport == 2152:
+        sys.exit(0)
+sys.exit("expected GTP-U packet not observed at gnb despite nft accept")
+PYEOF
+ else
+ python3 - "$out" <<'PYEOF'
+import sys
+from scapy.all import rdpcap, IP, UDP
+
+pkts = rdpcap(sys.argv[1])
+for p in pkts:
+    if IP in p and UDP in p and p[UDP].dport == 2152:
+        sys.exit("GTP-U packet leaked to gnb despite nft drop on inner")
+sys.exit(0)
+PYEOF
+ fi
+ local rc=$?
+ rm -f "$out"
+ return $rc
+}
+
+main()
+{
+ local rc=0
+
+ check_dependencies
+ setup
+
+ # Default /64 layout: IPv4 SA at IPv6 bytes 8..11.
+ if run_test "2001:db8:1::a00:1:0:1" "2001:db8:a00:2:1400:1:2300:0" \
+ "10.0.0.1"; then
+ echo "TEST: End.M.GTP4.E (default /64) [PASS]"
+ else
+ echo "TEST: End.M.GTP4.E (default /64) [FAIL]"
+ rc=1
+ fi
+
+ # v6_src_prefix_len 48 layout: IPv4 SA at IPv6 bytes 6..9.
+ if run_test "2001:db8:3:a00:1::1" "2001:db9:a00:2:1400:1:2300:0" \
+ "10.0.0.1"; then
+ echo "TEST: End.M.GTP4.E (v6_src_prefix_len 48) [PASS]"
+ else
+ echo "TEST: End.M.GTP4.E (v6_src_prefix_len 48) [FAIL]"
+ rc=1
+ fi
+
+ # pdu_type unset: emit short GTPv1-U with no PDU Session Container
+ # even though Args.Mob.Session encodes QFI=5. This is the LTE-only
+ # / S1-U style output.
+ if run_test_short "2001:db8:1::a00:1:0:1" \
+ "2001:dbb:a00:2:1400:1:2300:0" \
+ "10.0.0.1"; then
+ echo "TEST: End.M.GTP4.E (pdu_type unset, short header) [PASS]"
+ else
+ echo "TEST: End.M.GTP4.E (pdu_type unset, short header) [FAIL]"
+ rc=1
+ fi
+
+ # VRF binding (per-tenant): egress IPv4 lookup goes through vrf-n3
+ # (table 100), where 10.0.1.0/24 lives. Without "oif vrf-n3" the
+ # main-table lookup would fall through; the GTP-U observed in
+ # gnb_vrf demonstrates the binding. SID 2001:dba:a00:102:1400:1:2300:0
+ # encodes IPv4 DA 10.0.1.2 + QFI=5 / TEID=0x123.
+ # Reported as [SKIP] when VRF support (CONFIG_NET_VRF) is unavailable.
+ if [ "$have_vrf" = "1" ]; then
+ if run_test "2001:db8:1::a00:1:0:1" \
+ "2001:dba:a00:102:1400:1:2300:0" \
+ "10.0.0.1" "$gnb_vrf" "veth-n3-2-gnb"; then
+ echo "TEST: End.M.GTP4.E (oif vrf-n3) [PASS]"
+ else
+ echo "TEST: End.M.GTP4.E (oif vrf-n3) [FAIL]"
+ rc=1
+ fi
+ else
+ echo "TEST: End.M.GTP4.E (oif vrf-n3) [SKIP] (VRF support unavailable)"
+ fi
+
+ # Inner T-PDU netfilter hook: only meaningful when nft is present
+ # and the kernel exposes net.netfilter.nf_hooks_lwtunnel.
+ if command -v nft >/dev/null && \
+ ip netns exec "$srgw" sysctl -wq \
+ net.netfilter.nf_hooks_lwtunnel=1 2>/dev/null; then
+ ip netns exec "$srgw" nft add table ip6 filter
+ ip netns exec "$srgw" nft 'add chain ip6 filter prerouting' \
+ '{ type filter hook prerouting priority 0; }'
+
+ if run_nf_test drop ""; then
+ echo "TEST: End.M.GTP4.E (nft drop on inner) [PASS]"
+ else
+ echo "TEST: End.M.GTP4.E (nft drop on inner) [FAIL]"
+ rc=1
+ fi
+
+ if run_nf_test accept "1"; then
+ echo "TEST: End.M.GTP4.E (nft accept on inner) [PASS]"
+ else
+ echo "TEST: End.M.GTP4.E (nft accept on inner) [FAIL]"
+ rc=1
+ fi
+ else
+ echo "TEST: End.M.GTP4.E (inner-flow netfilter hook) [SKIP]" \
+ "(nft or nf_hooks_lwtunnel unavailable)"
+ fi
+
+ if [ "$rc" -eq 0 ]; then
+ echo "TEST: End.M.GTP4.E [PASS]"
+ exit "$ksft_pass"
+ else
+ echo "TEST: End.M.GTP4.E [FAIL]"
+ exit "$ksft_fail"
+ fi
+}
+
+main "$@"
--
2.50.1
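The SID and source-address layouts described in the selftest header can be checked by hand. The following is a minimal sketch, not part of the patch: the helper names (get_bits, build_gtp4_sid, v4_sa_from_ipv6) are hypothetical, it assumes v4_mask_len = 32 and leaves the R and U bits of Args.Mob.Session at zero, and it ignores the configured src template that the in-kernel seg6_mobile_v4_sa() also takes.

```python
import ipaddress

ARGS_MOB_LEN = 40  # QFI (6) | R (1) | U (1) | PDU Session ID / TEID (32)


def get_bits(addr: ipaddress.IPv6Address, offset: int, length: int) -> int:
    """Return `length` bits of `addr`, starting `offset` bits from the MSB."""
    if offset + length > 128:
        raise ValueError("bit range exceeds the 128-bit address")
    return (int(addr) >> (128 - offset - length)) & ((1 << length) - 1)


def build_gtp4_sid(locator: str, v4_da: str, qfi: int,
                   teid: int) -> ipaddress.IPv6Address:
    """Locator | IPv4 DA (32 bits) | Args.Mob.Session (40 bits) | zero padding."""
    net = ipaddress.IPv6Network(locator)
    used = net.prefixlen + 32 + ARGS_MOB_LEN
    if used > 128:
        raise ValueError("locator leaves no room for IPv4 DA + Args.Mob.Session")
    # QFI occupies the top 6 bits of the 40-bit field, the TEID the low 32.
    args = ((qfi & 0x3F) << 34) | (teid & 0xFFFFFFFF)
    value = int(net.network_address)
    value |= int(ipaddress.IPv4Address(v4_da)) << (128 - net.prefixlen - 32)
    value |= args << (128 - used)
    return ipaddress.IPv6Address(value)


def v4_sa_from_ipv6(src: str, prefix_bits: int = 64) -> ipaddress.IPv4Address:
    """Read the 32-bit IPv4 SA that follows a `prefix_bits`-bit Source UPF Prefix."""
    return ipaddress.IPv4Address(
        get_bits(ipaddress.IPv6Address(src), prefix_bits, 32))


if __name__ == "__main__":
    print(build_gtp4_sid("2001:db8::/32", "10.0.0.2", qfi=5, teid=0x123))
    print(v4_sa_from_ipv6("2001:db8:1::a00:1:0:1"))      # default P = 64
    print(v4_sa_from_ipv6("2001:db8:3:a00:1::1", 48))    # explicit P = 48
```

Working at bit granularity rather than byte granularity keeps the sketch valid for locators that are not byte-aligned, which is the same reason the selftest picks a layout that is not tail-aligned.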
* [PATCH v2 3/7] seg6: add End.M.GTP6.E behavior
From: Yuya Kusakabe @ 2026-05-04 16:30 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Andrea Mayer, Shuah Khan, Jonathan Corbet,
Shuah Khan
Cc: linux-kernel, netdev, linux-kselftest, linux-doc, Yuya Kusakabe
Add the End.M.GTP6.E behavior (RFC 9433 Section 6.5), the IPv6
counterpart of End.M.GTP4.E. An End.M.GTP6.E SID always sits in the
penultimate position of an SR Policy (RFC 9433 Section 6.5 Notes);
when it becomes the active SID (segments_left == 1) the kernel pops
the outer IPv6 header and SRH, recovers the TEID and QFI from the
40-bit Args.Mob.Session field that immediately follows the locator
in the SID, and re-encapsulates the inner T-PDU in IPv6/UDP/GTP-U
toward the next segment held in SRH[0].
The flow label, traffic class and hop limit are propagated from the
inbound outer IPv6 header to the new outer header (RFC 6040).
When net.netfilter.nf_hooks_lwtunnel=1, the inner T-PDU traverses
NF_INET_PRE_ROUTING between the SRv6 strip and the GTP-U push,
mirroring End.DX4 / End.DX6.
Configuration:
ip -6 route add 2001:db8:e::/64 \
encap seg6local action End.M.GTP6.E src 2001:db8:2::1 \
dev <dev>
Link: https://www.rfc-editor.org/rfc/rfc9433.html#section-6.5
Link: https://www.rfc-editor.org/rfc/rfc6040
Signed-off-by: Yuya Kusakabe <yuya.kusakabe@gmail.com>
---
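For readers checking the SID layout by hand, the locator-relative extraction that seg6_mobile_extract_args_mob() performs can be sketched as follows. This is an illustration only and not part of the patch: the function names are hypothetical, the example SID is made up to match the 2001:db8:e::/64 route shown under Configuration, and the field split assumes the QFI (6) | R (1) | U (1) | TEID (32) layout of RFC 9433 Section 6.1.

```python
import ipaddress
from typing import Optional, Tuple

ARGS_MOB_LEN = 40  # bits occupied by Args.Mob.Session


def extract_args_mob(sid: str, prefix_bits: int) -> Optional[int]:
    """Return the 40 bits that follow a `prefix_bits`-bit locator, or None
    when the field would extend past the end of the 128-bit address."""
    if prefix_bits + ARGS_MOB_LEN > 128:
        return None
    value = int(ipaddress.IPv6Address(sid))
    shift = 128 - prefix_bits - ARGS_MOB_LEN
    return (value >> shift) & ((1 << ARGS_MOB_LEN) - 1)


def split_args_mob(args: int) -> Tuple[int, int]:
    """Split Args.Mob.Session into (QFI, TEID); the R and U bits are ignored."""
    qfi = (args >> 34) & 0x3F
    teid = args & 0xFFFFFFFF
    return qfi, teid


if __name__ == "__main__":
    # Hypothetical SID: /64 locator, then bytes 14 00 00 01 23, then padding.
    args = extract_args_mob("2001:db8:e:0:1400:1:2300:0", 64)
    if args is not None:
        qfi, teid = split_args_mob(args)
        print(f"args=0x{args:010x} qfi={qfi} teid=0x{teid:08x}")
```

The bounds check mirrors the kernel helper: a locator longer than 88 bits leaves fewer than 40 bits for Args.Mob.Session, so the lookup is rejected rather than truncated.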
include/uapi/linux/seg6_local.h | 2 +
net/ipv6/seg6_local.c | 312 ++++++++++++++++
tools/testing/selftests/net/Makefile | 1 +
.../selftests/net/srv6_end_m_gtp6_e_test.sh | 402 +++++++++++++++++++++
4 files changed, 717 insertions(+)
diff --git a/include/uapi/linux/seg6_local.h b/include/uapi/linux/seg6_local.h
index b42cb526bb81..8e46ede2980d 100644
--- a/include/uapi/linux/seg6_local.h
+++ b/include/uapi/linux/seg6_local.h
@@ -75,6 +75,8 @@ enum {
SEG6_LOCAL_ACTION_END_MAP = 17,
/* SRv6 to IPv4/GTP-U encap (RFC 9433 Section 6.6) */
SEG6_LOCAL_ACTION_END_M_GTP4_E = 18,
+ /* SRv6 to IPv6/GTP-U encap (RFC 9433 Section 6.5) */
+ SEG6_LOCAL_ACTION_END_M_GTP6_E = 19,
__SEG6_LOCAL_ACTION_MAX,
};
diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c
index 4051fe89e6d1..4e5d138c3657 100644
--- a/net/ipv6/seg6_local.c
+++ b/net/ipv6/seg6_local.c
@@ -22,6 +22,7 @@
#include <linux/seg6.h>
#include <linux/seg6_local.h>
#include <net/addrconf.h>
+#include <net/ip6_checksum.h>
#include <net/ip6_route.h>
#include <net/dst_cache.h>
#include <net/ip_tunnels.h>
@@ -1622,6 +1623,21 @@ static unsigned int seg6_mobile_skb_prefix_bits(const struct sk_buff *skb)
return plen;
}
+/* Read Args.Mob.Session from @daddr right after a @prefix_bits-bit
+ * locator (RFC 9433 Section 6.5). Returns false if the field would
+ * extend past the end of the 128-bit address.
+ */
+static bool seg6_mobile_extract_args_mob(const struct in6_addr *daddr,
+ unsigned int prefix_bits,
+ u64 *args_out)
+{
+ if (prefix_bits + SEG6_MOBILE_ARGS_MOB_LEN > 128)
+ return false;
+
+ *args_out = seg6_mobile_addr_get_bits(daddr->s6_addr, prefix_bits,
+ SEG6_MOBILE_ARGS_MOB_LEN);
+ return true;
+}
+
/* GTP-U PDU Session extension header (3GPP TS 38.415).
* 4-byte minimum unit: ext_len=1, PDU Type in high 4 bits of @pdu_type_spare,
* QFI in low 6 bits of @spare_qfi, next_ext=0.
@@ -1997,6 +2013,265 @@ static int input_action_end_m_gtp4_e(struct sk_buff *skb,
return -EINVAL;
}
+/* Per-skb context preserved across the NF_INET_PRE_ROUTING hook on
+ * the inner T-PDU exposed by End.M.GTP6.E. After the outer SRv6 is
+ * popped the inner IP is briefly visible to netfilter; the finish
+ * half then builds the new IPv6/UDP/GTP-U outer using these fields.
+ */
+struct seg6_mobile_gtp6_e_cb {
+ struct in6_addr next_sid;
+ __be32 flowlabel;
+ u32 teid;
+ u8 qfi;
+ u8 tclass;
+ u8 hop_limit;
+ u8 pdu_type;
+ bool pdu_type_set;
+};
+
+#define SEG6_MOBILE_GTP6_E_CB(skb) \
+ ((struct seg6_mobile_gtp6_e_cb *)((skb)->cb))
+
+static int input_action_end_m_gtp6_e_finish(struct net *net,
+ struct sock *sk,
+ struct sk_buff *skb)
+{
+ enum skb_drop_reason reason = SKB_DROP_REASON_SEG6_MOBILE_NOMEM;
+ struct seg6_mobile_gtp6_e_cb cb = *SEG6_MOBILE_GTP6_E_CB(skb);
+ struct dst_entry *orig_dst = skb_dst(skb);
+ const struct seg6_mobile_info *minfo;
+ struct seg6_local_lwt *slwt;
+ struct ipv6hdr *new_ip6h;
+ struct udphdr *uh;
+
+ slwt = seg6_local_lwtunnel(orig_dst->lwtstate);
+ minfo = &slwt->mobile_info;
+
+ /* Reject GSO packets that would not fit the egress IPv6/UDP/GTP-U
+ * path after our outer headers are added; the GSO segmenter cannot
+ * adjust mss across SRv6 -> GTP-U conversion. Skip the check
+ * entirely when no MTU is known on the current dst.
+ */
+ if (skb_is_gso(skb)) {
+ unsigned int ovhd = sizeof(*new_ip6h) + sizeof(*uh) +
+ sizeof(struct gtp1_header_long) +
+ sizeof(struct seg6_mobile_pdu_session_ext);
+ unsigned int mtu = dst_mtu(skb_dst(skb));
+
+ if (mtu && (mtu <= ovhd ||
+ !skb_gso_validate_network_len(skb, mtu - ovhd))) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_MTU_EXCEEDED;
+ goto drop;
+ }
+ }
+
+ /* Reserve worst-case headroom for the entire outer chain we are about
+ * to push: IPv6 + UDP + GTP-U long header + PDU Session extension.
+ * Subsequent skb_cow_head() calls inside seg6_mobile_push_gtpu() then
+ * become no-ops.
+ */
+ if (skb_cow_head(skb,
+ sizeof(*new_ip6h) + sizeof(*uh) +
+ sizeof(struct gtp1_header_long) +
+ sizeof(struct seg6_mobile_pdu_session_ext)))
+ goto drop;
+
+ if (seg6_mobile_push_gtpu(skb, cb.teid, cb.qfi, cb.pdu_type,
+ cb.pdu_type_set))
+ goto drop;
+
+ uh = skb_push(skb, sizeof(*uh));
+ skb_reset_transport_header(skb);
+ uh->source = htons(GTP1U_PORT);
+ uh->dest = htons(GTP1U_PORT);
+ uh->len = htons(skb->len);
+
+ new_ip6h = skb_push(skb, sizeof(*new_ip6h));
+ skb_reset_network_header(skb);
+ memset(new_ip6h, 0, sizeof(*new_ip6h));
+ ip6_flow_hdr(new_ip6h, cb.tclass, cb.flowlabel);
+ new_ip6h->payload_len = htons(skb->len - sizeof(*new_ip6h));
+ new_ip6h->nexthdr = IPPROTO_UDP;
+ new_ip6h->hop_limit = cb.hop_limit;
+ new_ip6h->saddr = minfo->src_addr;
+ new_ip6h->daddr = cb.next_sid;
+
+ /* RFC 8200 requires UDP/IPv6 checksums. Initialise the
+ * pseudo-header sum and let the stack/NIC complete it via
+ * CHECKSUM_PARTIAL so we do not pay a per-packet linear sum and
+ * we cooperate with offload.
+ */
+ skb->ip_summed = CHECKSUM_PARTIAL;
+ skb->csum_start = (unsigned char *)uh - skb->head;
+ skb->csum_offset = offsetof(struct udphdr, check);
+ uh->check = ~csum_ipv6_magic(&new_ip6h->saddr, &new_ip6h->daddr,
+ skb->len - sizeof(*new_ip6h),
+ IPPROTO_UDP, 0);
+
+ skb->protocol = htons(ETH_P_IPV6);
+ nf_reset_ct(skb);
+ skb_dst_drop(skb);
+
+ seg6_lookup_any_nexthop(skb, &cb.next_sid, 0, false, slwt->oif);
+ return dst_input(skb);
+
+drop:
+ kfree_skb_reason(skb, reason);
+ return -EINVAL;
+}
+
+/* RFC 9433 Section 6.5 -- End.M.GTP6.E
+ * Receives an SRv6 packet whose current SID is an End.M.GTP6.E SID
+ * (Segments Left == 1) and re-encapsulates the inner payload in
+ * IPv6/UDP/GTP-U (with an optional PDU Session extension header
+ * carrying the QFI) toward the next segment held in SRH[0].
+ *
+ * When net.netfilter.nf_hooks_lwtunnel=1 and the inner is a valid
+ * IPv4 / IPv6 packet, NF_INET_PRE_ROUTING fires on the bare inner
+ * T-PDU between the SRv6 strip and the GTP-U push.
+ */
+static int input_action_end_m_gtp6_e(struct sk_buff *skb,
+ struct seg6_local_lwt *slwt)
+{
+ enum skb_drop_reason reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_SID;
+ const struct seg6_mobile_info *minfo = &slwt->mobile_info;
+ struct seg6_mobile_gtp6_e_cb *cb;
+ struct in6_addr next_sid;
+ struct ipv6_sr_hdr *srh;
+ u8 hop_limit, tclass, qfi;
+ unsigned int outer_len;
+ struct ipv6hdr *ip6h;
+ int inner_nfproto;
+ __be32 flowlabel;
+ __be16 frag_off;
+ u64 args_mob;
+ u32 teid;
+ int off;
+ u8 nh;
+
+ BUILD_BUG_ON(sizeof(struct seg6_mobile_gtp6_e_cb) >
+ sizeof_field(struct sk_buff, cb));
+
+ /* End.M.GTP6.E SRH-S02 (RFC 9433 Section 6.5) mandates the SRH be
+ * present with segments_left == 1. Use the legacy seg6 helper
+ * that enforces "SRH present" + HMAC; seg6_mobile_get_validated_srh()
+ * tolerates SRH-less packets via its @missing out-parameter, which
+ * is the wrong semantic here.
+ */
+ srh = get_and_validate_srh(skb);
+ if (!srh) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_INVALID_SRH_SL;
+ goto drop;
+ }
+
+ /* RFC 9433 Section 6.5 SRH-S02: Segments Left MUST be 1 */
+ if (srh->segments_left != 1) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_INVALID_SRH_SL;
+ goto drop;
+ }
+
+ /* @ip6h is fresh: get_and_validate_srh() already pulled at least
+ * sizeof(struct ipv6hdr) via pskb_may_pull(), so ipv6_hdr(skb) here
+ * is valid even though pskb_may_pull() may have reallocated
+ * skb->head inside that call.
+ */
+ ip6h = ipv6_hdr(skb);
+
+ if (!seg6_mobile_extract_args_mob(&ip6h->daddr,
+ seg6_mobile_skb_prefix_bits(skb),
+ &args_mob))
+ goto drop;
+ teid = seg6_mobile_teid_from_args(args_mob);
+ qfi = seg6_mobile_qfi_from_args(args_mob);
+
+ /* SRH[0] is the next segment for the new GTP-U tunnel */
+ next_sid = srh->segments[0];
+
+ /* RFC 6040 outer-to-outer propagation: copy DSCP+ECN (tclass) and
+ * the flow label from the SRv6 outer to the new IPv6 outer. Use
+ * ip6_flowlabel() (not ip6_flowinfo()) so the tclass byte is
+ * supplied exactly once via the @tclass argument of ip6_flow_hdr().
+ */
+ flowlabel = ip6_flowlabel(ip6h);
+ tclass = ipv6_get_dsfield(ip6h);
+ hop_limit = ip6h->hop_limit;
+
+ /* RFC 9433 Section 6.5 upper-layer S02 mandates "Pop the IPv6
+ * header and all its extension headers". ipv6_skip_exthdr()
+ * walks every extension header (HBH/Routing/Dest-Opts/Fragment)
+ * so HBH-before-SRH and DOpts-after-SRH are handled too. The
+ * terminal next-header value also selects NFPROTO_IPV4 /
+ * NFPROTO_IPV6 for the NF_INET_PRE_ROUTING hook below.
+ */
+ nh = ip6h->nexthdr;
+ off = ipv6_skip_exthdr(skb, sizeof(*ip6h), &nh, &frag_off);
+ if (off < 0) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_INNER;
+ goto drop;
+ }
+ outer_len = off;
+
+ switch (nh) {
+ case IPPROTO_IPIP:
+ inner_nfproto = NFPROTO_IPV4;
+ break;
+ case IPPROTO_IPV6:
+ inner_nfproto = NFPROTO_IPV6;
+ break;
+ default:
+ inner_nfproto = -1;
+ break;
+ }
+
+ /* For inner IP traffic that may traverse NF_INET_PRE_ROUTING below,
+ * pull the full inner IP header into the linear area so a netfilter
+ * hook reading it via skb_network_header() stays within linear data.
+ * A non-IP inner is forwarded as-is as the GTP-U T-PDU payload.
+ */
+ if (!pskb_may_pull(skb, outer_len + ((inner_nfproto == NFPROTO_IPV4) ?
+ sizeof(struct iphdr) :
+ (inner_nfproto == NFPROTO_IPV6) ?
+ sizeof(struct ipv6hdr) : 0))) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_INNER;
+ goto drop;
+ }
+
+ skb_pull_rcsum(skb, outer_len);
+ skb_reset_network_header(skb);
+ skb_reset_transport_header(skb);
+
+ cb = SEG6_MOBILE_GTP6_E_CB(skb);
+ cb->next_sid = next_sid;
+ cb->flowlabel = flowlabel;
+ cb->teid = teid;
+ cb->qfi = qfi;
+ cb->tclass = tclass;
+ cb->hop_limit = hop_limit;
+ cb->pdu_type = minfo->pdu_type;
+ cb->pdu_type_set = minfo->pdu_type_set;
+
+ if (inner_nfproto >= 0 &&
+ static_branch_unlikely(&nf_hooks_lwtunnel_enabled)) {
+ skb->protocol = (inner_nfproto == NFPROTO_IPV4) ?
+ htons(ETH_P_IP) : htons(ETH_P_IPV6);
+ skb_set_transport_header(skb,
+ (inner_nfproto == NFPROTO_IPV4) ?
+ sizeof(struct iphdr) :
+ sizeof(struct ipv6hdr));
+ nf_reset_ct(skb);
+
+ return NF_HOOK(inner_nfproto, NF_INET_PRE_ROUTING,
+ dev_net(skb->dev), NULL, skb, skb->dev,
+ NULL, input_action_end_m_gtp6_e_finish);
+ }
+
+ return input_action_end_m_gtp6_e_finish(dev_net(skb->dev), NULL, skb);
+
+drop:
+ kfree_skb_reason(skb, reason);
+ return -EINVAL;
+}
+
/* RFC 9433 Section 6.2 -- End.MAP
* Replace the outer IPv6 destination address with the configured next
* SID, decrement the Hop Limit, and forward via IPv6 routing. The
@@ -2042,6 +2317,9 @@ static int input_action_end_map(struct sk_buff *skb,
static int seg6_mobile_v4_validate(struct seg6_local_lwt *slwt,
const void *cfg,
struct netlink_ext_ack *extack);
+static int seg6_mobile_gtp6_e_validate(struct seg6_local_lwt *slwt,
+ const void *cfg,
+ struct netlink_ext_ack *extack);
static struct seg6_action_desc seg6_action_table[] = {
{
@@ -2153,6 +2431,17 @@ static struct seg6_action_desc seg6_action_table[] = {
.build_state = seg6_mobile_v4_validate,
},
},
+ {
+ .action = SEG6_LOCAL_ACTION_END_M_GTP6_E,
+ .attrs = SEG6_F_ATTR(SEG6_LOCAL_MOBILE_SRC_ADDR),
+ .optattrs = SEG6_F_LOCAL_COUNTERS |
+ SEG6_F_ATTR(SEG6_LOCAL_MOBILE_PDU_TYPE) |
+ SEG6_F_ATTR(SEG6_LOCAL_OIF),
+ .input = input_action_end_m_gtp6_e,
+ .slwt_ops = {
+ .build_state = seg6_mobile_gtp6_e_validate,
+ },
+ },
{
.action = SEG6_LOCAL_ACTION_END_MAP,
.attrs = SEG6_F_ATTR(SEG6_LOCAL_NH6),
@@ -2645,6 +2934,29 @@ static int seg6_mobile_v4_validate(struct seg6_local_lwt *slwt,
return 0;
}
+/* End.M.GTP6.E SID layout (RFC 9433 Section 6.5):
+ *
+ * | locator (route prefix) | Args.Mob.Session (40) | pad |
+ *
+ * The locator length is the route's IPv6 destination prefix length.
+ * Reject route additions whose prefix leaves no room for the 40-bit
+ * Args.Mob.Session field at setup time so the operator gets a clear
+ * error from `ip route add` instead of silent per-packet drops.
+ */
+static int seg6_mobile_gtp6_e_validate(struct seg6_local_lwt *slwt,
+ const void *cfg,
+ struct netlink_ext_ack *extack)
+{
+ const struct fib6_config *fib6_cfg = cfg;
+
+ if ((unsigned int)fib6_cfg->fc_dst_len + SEG6_MOBILE_ARGS_MOB_LEN > 128) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "End.M.GTP6.E route prefix length must leave room for the 40-bit Args.Mob.Session (prefix_len <= 88)");
+ return -EINVAL;
+ }
+ return 0;
+}
+
#define MAX_PROG_NAME 256
static const struct nla_policy bpf_prog_policy[SEG6_LOCAL_BPF_PROG_MAX + 1] = {
[SEG6_LOCAL_BPF_PROG] = { .type = NLA_U32, },
diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index db1b3ef48f19..01dafec5b60f 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -91,6 +91,7 @@ TEST_PROGS := \
srv6_end_dx6_netfilter_test.sh \
srv6_end_flavors_test.sh \
srv6_end_m_gtp4_e_test.sh \
+ srv6_end_m_gtp6_e_test.sh \
srv6_end_map_test.sh \
srv6_end_next_csid_l3vpn_test.sh \
srv6_end_x_next_csid_l3vpn_test.sh \
diff --git a/tools/testing/selftests/net/srv6_end_m_gtp6_e_test.sh b/tools/testing/selftests/net/srv6_end_m_gtp6_e_test.sh
new file mode 100755
index 000000000000..a0f7fb37db37
--- /dev/null
+++ b/tools/testing/selftests/net/srv6_end_m_gtp6_e_test.sh
@@ -0,0 +1,402 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# shellcheck disable=SC2034,SC2154
+#
+# Selftest for the SRv6 End.M.GTP6.E behavior (RFC 9433 Section 6.5).
+#
+# +-------+ 2001:db8:1::/64 +-------+ 2001:db8:2::/64 +-------+
+# | srupf | ------------------- | srgw | ------------------- | gnb |
+# +-------+ veth-n9 +-------+ veth-n3 +-------+
+#
+# srupf is the SR-domain-side SRv6-aware UPF (RFC 9433 sense, not a
+# 3GPP UPF) that injects the SRv6 packets, gnb is the GTP-U-side
+# test peer, and srgw runs the End.M.GTP6.E behavior under test.
+#
+# An End.M.GTP6.E SID is installed on srgw for locator
+# 2001:db8:f::/64 with src=2001:db8:2::1. Args.Mob.Session is the
+# fixed 40-bit field defined by RFC 9433 Section 6.1, Figure 8, immediately
+# after the locator (here at byte offset 8). The bytes after
+# Args.Mob.Session are SID padding and are ignored by the egress.
+# The srupf uses scapy to inject an SRv6 packet with:
+#
+# outer DA = 2001:db8:f::1400:1:2300:0
+# (locator 2001:db8:f::/64 followed by
+# Args.Mob.Session bytes 14 00 00 01 23 at
+# offset 8, which encode QFI=5 and
+# PDU Session ID=0x123, plus 24 bits of
+# SID padding)
+# SRH segments[0] = 2001:db8:2::2 (gNB, next destination)
+# SRH segments[1] = 2001:db8:f::1400:1:2300:0 (current SID)
+# SRH segments_left = 1
+#
+# The expected output on veth-n3-gnb is an IPv6/UDP/GTP-U(long)/PDU-Session-ext
+# packet toward 2001:db8:2::2 carrying TEID 0x00000123 and QFI 5.
+
+source lib.sh
+
+readonly TIMEOUT=4
+
+tcpdump_pid=""
+have_vrf=0
+
+cleanup()
+{
+ if [ -n "$tcpdump_pid" ]; then
+ kill "$tcpdump_pid" 2>/dev/null
+ wait "$tcpdump_pid" 2>/dev/null
+ fi
+ cleanup_all_ns
+}
+
+trap cleanup EXIT
+
+setup()
+{
+ setup_ns srupf srgw gnb gnb_vrf
+
+ ip -n "$srupf" link set lo up
+ ip -n "$srgw" link set lo up
+ ip -n "$gnb" link set lo up
+ ip -n "$gnb_vrf" link set lo up
+
+ ip link add veth-n9 netns "$srupf" type veth peer name veth-n9-srgw \
+ netns "$srgw"
+ ip -n "$srupf" addr add 2001:db8:1::1/64 dev veth-n9 nodad
+ ip -n "$srgw" addr add 2001:db8:1::2/64 dev veth-n9-srgw nodad
+ ip -n "$srupf" link set veth-n9 up
+ ip -n "$srgw" link set veth-n9-srgw up
+
+ ip link add veth-n3 netns "$srgw" type veth peer name veth-n3-gnb \
+ netns "$gnb"
+ ip -n "$srgw" addr add 2001:db8:2::1/64 dev veth-n3 nodad
+ ip -n "$gnb" addr add 2001:db8:2::2/64 dev veth-n3-gnb nodad
+ ip -n "$srgw" link set veth-n3 up
+ ip -n "$gnb" link set veth-n3-gnb up
+
+ ip netns exec "$srgw" sysctl -wq net.ipv6.conf.all.forwarding=1
+
+ # install End.M.GTP6.E on srgw with PDU Session Container (5G N3:
+ # pdu_type dl), /64 locator.
+ ip -n "$srgw" -6 route add 2001:db8:f::/64 \
+ encap seg6local action End.M.GTP6.E \
+ src 2001:db8:2::1 pdu_type dl \
+ dev veth-n3
+
+ # install End.M.GTP6.E on srgw WITHOUT pdu_type: short GTPv1-U
+ # (LTE-style, no PDU Session Container) regardless of QFI.
+ ip -n "$srgw" -6 route add 2001:db8:fa::/64 \
+ encap seg6local action End.M.GTP6.E \
+ src 2001:db8:2::1 \
+ dev veth-n3
+
+ # Per-route VRF case: a second egress IPv6 path in its own VRF so we
+ # can verify that the End.M.GTP6.E SID's egress GTP-U lookup uses
+ # the configured 'oif' rather than the main routing table.
+ # Reported as [SKIP] when VRF support (CONFIG_NET_VRF) is unavailable.
+ modprobe vrf 2>/dev/null
+ if ip -n "$srgw" link add vrf-n3 type vrf table 100 2>/dev/null; then
+ have_vrf=1
+ ip -n "$srgw" link set dev vrf-n3 up
+
+ ip link add veth-n3-2 netns "$srgw" type veth peer name \
+ veth-n3-2-gnb netns "$gnb_vrf"
+ ip -n "$srgw" link set dev veth-n3-2 master vrf-n3
+ ip -n "$srgw" addr add 2001:db8:3::1/64 dev veth-n3-2 nodad
+ ip -n "$gnb_vrf" addr add 2001:db8:3::2/64 dev veth-n3-2-gnb nodad
+ ip -n "$srgw" link set dev veth-n3-2 up
+ ip -n "$gnb_vrf" link set dev veth-n3-2-gnb up
+
+ ip -n "$srgw" -6 route add 2001:db8:e::/64 \
+ encap seg6local action End.M.GTP6.E \
+ src 2001:db8:3::1 oif vrf-n3 pdu_type dl \
+ dev veth-n3-2
+ fi
+}
+
+check_dependencies()
+{
+ if ! command -v tcpdump >/dev/null; then
+ echo "SKIP: tcpdump is required"; exit "$ksft_skip"
+ fi
+ if ! command -v python3 >/dev/null; then
+ echo "SKIP: python3 is required"; exit "$ksft_skip"
+ fi
+ if ! python3 -c "from scapy.layers.inet6 import IPv6ExtHdrSegmentRouting" 2>/dev/null; then
+ echo "SKIP: python3-scapy with SRv6 support is required"
+ exit "$ksft_skip"
+ fi
+
+ if ! ip route help 2>&1 | grep -qF "End.M.GTP6.E"; then
+ echo "SKIP: iproute2 too old, missing seg6local action End.M.GTP6.E"
+ exit "$ksft_skip"
+ fi
+}
+
+inject_srv6()
+{
+ local sid="$1" # outer IPv6 DA (current End.M.GTP6.E SID)
+ local next_seg="$2" # SRH segments[0] (next destination = gNB)
+ local srgw_mac
+
+ srgw_mac=$(ip -n "$srgw" -j link show veth-n9-srgw | \
+ python3 -c 'import sys, json; print(json.load(sys.stdin)[0]["address"])')
+
+ SRGW_MAC="$srgw_mac" SID="$sid" NEXT_SEG="$next_seg" \
+ ip netns exec "$srupf" python3 - <<'PY'
+import os
+from scapy.all import IPv6, ICMPv6EchoRequest, sendp, Ether
+from scapy.layers.inet6 import IPv6ExtHdrSegmentRouting
+
+mac = os.environ['SRGW_MAC']
+sid = os.environ['SID']
+next_seg = os.environ['NEXT_SEG']
+inner = IPv6(src='2001:db8:1::1', dst='2001:db8:dead::1') / \
+ ICMPv6EchoRequest(data=b'X' * 16)
+srh = IPv6ExtHdrSegmentRouting(
+ addresses=[next_seg, sid],
+ segleft=1, lastentry=1, nh=41)
+pkt = Ether(dst=mac) / \
+ IPv6(src='2001:db8:1::1', dst=sid, nh=43) / \
+ srh / inner
+sendp(pkt, iface='veth-n9', verbose=False)
+PY
+}
+
+capture_traffic()
+{
+ local capture_ns="$1"
+ local capture_iface="$2"
+ local sid="$3"
+ local next_seg="$4"
+ local out="$5"
+
+ ip netns exec "$capture_ns" tcpdump -U -nni "$capture_iface" -w "$out" \
+ 'ip6 and udp port 2152' 2>/dev/null &
+ tcpdump_pid=$!
+ # Give tcpdump a brief moment to attach the BPF filter.
+ sleep 1
+
+ inject_srv6 "$sid" "$next_seg"
+
+ sleep 1
+ kill -INT "$tcpdump_pid" 2>/dev/null
+ wait "$tcpdump_pid" 2>/dev/null
+ tcpdump_pid=""
+}
+
+run_test()
+{
+ local sid="$1" # End.M.GTP6.E SID to send to
+ local next_seg="$2" # expected outer IPv6 DA in egress GTP-U
+ local capture_ns="${3:-$gnb}" # netns where GTP-U is expected to land
+ local capture_iface="${4:-veth-n3-gnb}"
+ local out
+
+ out=$(mktemp)
+ capture_traffic "$capture_ns" "$capture_iface" "$sid" "$next_seg" "$out"
+
+ # Verify with scapy field comparison: the captured frame must be
+ # IPv6/UDP(2152)/GTP-U toward $next_seg, carry TEID 0x00000123 and a
+ # PDU Session ext with QFI=5.
+ NEXT_SEG="$next_seg" python3 - "$out" <<'PYEOF'
+import os, sys
+from scapy.all import rdpcap, IPv6, UDP
+
+next_seg = os.environ['NEXT_SEG']
+pkts = rdpcap(sys.argv[1])
+for p in pkts:
+    if not (IPv6 in p and UDP in p):
+        continue
+    if str(p[IPv6].dst) != next_seg:
+        continue
+    if p[UDP].dport != 2152:
+        continue
+    payload = bytes(p[UDP].payload)
+    if len(payload) < 12:
+        continue
+    teid = int.from_bytes(payload[4:8], 'big')
+    if teid != 0x00000123:
+        sys.exit(f"unexpected TEID 0x{teid:08x}, want 0x00000123")
+    if payload[11] != 0x85:
+        sys.exit(f"missing PDU Session ext (next={payload[11]:#04x}, want 0x85)")
+    pdu_session = payload[12:16]
+    if pdu_session[0] != 0x01 or (pdu_session[2] & 0x3f) != 5:
+        sys.exit(f"PDU Session fields unexpected: {pdu_session.hex()} (want 01 ?? 05 00)")
+    sys.exit(0)
+sys.exit(f"no IPv6/UDP/GTP-U packet observed toward {next_seg}")
+PYEOF
+ local rc=$?
+ rm -f "$out"
+ return $rc
+}
+
+# Verify the short-GTPv1-U output produced when pdu_type is unset on the
+# route: 8-byte GTP-U header, no extension flag, no PDU Session
+# Container, regardless of the QFI extracted from Args.Mob.Session.
+run_test_short()
+{
+ local sid="$1"
+ local next_seg="$2"
+ local out
+ local rc
+
+ out=$(mktemp)
+ capture_traffic "$gnb" "veth-n3-gnb" "$sid" "$next_seg" "$out"
+
+ NEXT_SEG="$next_seg" python3 - "$out" <<'PYEOF'
+import os, sys
+from scapy.all import rdpcap, IPv6, UDP
+
+next_seg = os.environ['NEXT_SEG']
+pkts = rdpcap(sys.argv[1])
+for p in pkts:
+    if not (IPv6 in p and UDP in p):
+        continue
+    if str(p[IPv6].dst) != next_seg:
+        continue
+    if p[UDP].dport != 2152:
+        continue
+    payload = bytes(p[UDP].payload)
+    if len(payload) < 8:
+        continue
+    flags = payload[0]
+    if flags != 0x30:
+        sys.exit(f"unexpected GTP-U flags {flags:#04x}, want 0x30 (short)")
+    teid = int.from_bytes(payload[4:8], 'big')
+    if teid != 0x00000123:
+        sys.exit(f"unexpected TEID 0x{teid:08x}, want 0x00000123")
+    sys.exit(0)
+sys.exit(f"no IPv6/UDP/GTP-U packet observed toward {next_seg}")
+PYEOF
+ rc=$?
+ rm -f "$out"
+ return $rc
+}
+
+# Verify that nf_hooks_lwtunnel=1 makes the inner T-PDU 5-tuple
+# visible to nftables on the SR Gateway. The inner is IPv6
+# (2001:db8:1::1 -> 2001:db8:dead::1, set by inject_srv6()); the nft
+# rule matches on its IPv6 source address. DROP must suppress the
+# GTP-U at the gnb, ACCEPT must let it through.
+run_nf_test()
+{
+ local verdict="$1" # drop | accept
+ local expect="$2" # 1 if GTP-U expected, empty otherwise
+ local sid="2001:db8:f::1400:1:2300:0"
+ local next_seg="2001:db8:2::2"
+ local out
+
+ ip netns exec "$srgw" nft flush chain ip6 filter prerouting
+ ip netns exec "$srgw" nft add rule ip6 filter prerouting \
+ ip6 saddr 2001:db8:1::1 "$verdict"
+
+ out=$(mktemp)
+ capture_traffic "$gnb" "veth-n3-gnb" "$sid" "$next_seg" "$out"
+
+ if [ -n "$expect" ]; then
+ NEXT_SEG="$next_seg" python3 - "$out" <<'PYEOF'
+import os, sys
+from scapy.all import rdpcap, IPv6, UDP
+
+next_seg = os.environ['NEXT_SEG']
+pkts = rdpcap(sys.argv[1])
+for p in pkts:
+    if IPv6 in p and UDP in p and \
+            str(p[IPv6].dst) == next_seg and p[UDP].dport == 2152:
+        sys.exit(0)
+sys.exit("expected GTP-U packet not observed at gnb despite nft accept")
+PYEOF
+ else
+ python3 - "$out" <<'PYEOF'
+import sys
+from scapy.all import rdpcap, IPv6, UDP
+
+pkts = rdpcap(sys.argv[1])
+for p in pkts:
+    if IPv6 in p and UDP in p and p[UDP].dport == 2152:
+        sys.exit("GTP-U packet leaked to gnb despite nft drop on inner")
+sys.exit(0)
+PYEOF
+ fi
+ local rc=$?
+ rm -f "$out"
+ return $rc
+}
+
+main()
+{
+ local rc=0
+
+ check_dependencies
+ setup
+
+ if run_test "2001:db8:f::1400:1:2300:0" "2001:db8:2::2"; then
+ echo "TEST: End.M.GTP6.E (default) [PASS]"
+ else
+ echo "TEST: End.M.GTP6.E (default) [FAIL]"
+ rc=1
+ fi
+
+ # pdu_type unset: emit short GTPv1-U with no PDU Session Container
+ # even though Args.Mob.Session encodes QFI=5.
+ if run_test_short "2001:db8:fa::1400:1:2300:0" "2001:db8:2::2"; then
+ echo "TEST: End.M.GTP6.E (pdu_type unset, short header) [PASS]"
+ else
+ echo "TEST: End.M.GTP6.E (pdu_type unset, short header) [FAIL]"
+ rc=1
+ fi
+
+ # VRF binding: egress IPv6 GTP-U goes through vrf-n3 (table 100),
+ # where the route to 2001:db8:3::/64 lives. Without "oif vrf-n3"
+ # the main-table lookup would fall through; the GTP-U observed in
+ # gnb_vrf demonstrates the binding.
+ # Reported as [SKIP] when VRF support (CONFIG_NET_VRF) is unavailable.
+ if [ "$have_vrf" = "1" ]; then
+ if run_test "2001:db8:e::1400:1:2300:0" "2001:db8:3::2" \
+ "$gnb_vrf" "veth-n3-2-gnb"; then
+ echo "TEST: End.M.GTP6.E (oif vrf-n3) [PASS]"
+ else
+ echo "TEST: End.M.GTP6.E (oif vrf-n3) [FAIL]"
+ rc=1
+ fi
+ else
+ echo "TEST: End.M.GTP6.E (oif vrf-n3) [SKIP] (CONFIG_NET_VRF not loaded)"
+ fi
+
+ # Inner T-PDU netfilter hook: only meaningful when nft is present
+ # and the kernel exposes net.netfilter.nf_hooks_lwtunnel.
+ if command -v nft >/dev/null && \
+ ip netns exec "$srgw" sysctl -wq \
+ net.netfilter.nf_hooks_lwtunnel=1 2>/dev/null; then
+ ip netns exec "$srgw" nft add table ip6 filter
+ ip netns exec "$srgw" nft 'add chain ip6 filter prerouting' \
+ '{ type filter hook prerouting priority 0; }'
+
+ if run_nf_test drop ""; then
+ echo "TEST: End.M.GTP6.E (nft drop on inner) [PASS]"
+ else
+ echo "TEST: End.M.GTP6.E (nft drop on inner) [FAIL]"
+ rc=1
+ fi
+
+ if run_nf_test accept "1"; then
+ echo "TEST: End.M.GTP6.E (nft accept on inner) [PASS]"
+ else
+ echo "TEST: End.M.GTP6.E (nft accept on inner) [FAIL]"
+ rc=1
+ fi
+ else
+ echo "TEST: End.M.GTP6.E (inner-flow netfilter hook) [SKIP]" \
+ "(nft or nf_hooks_lwtunnel unavailable)"
+ fi
+
+ if [ "$rc" -eq 0 ]; then
+ echo "TEST: End.M.GTP6.E [PASS]"
+ exit "$ksft_pass"
+ else
+ echo "TEST: End.M.GTP6.E [FAIL]"
+ exit "$ksft_fail"
+ fi
+}
+
+main "$@"
--
2.50.1
* [PATCH v2 4/7] seg6: add End.M.GTP6.D behavior
2026-05-04 16:30 [PATCH v2 0/7] seg6: add SRv6 Mobile User Plane (RFC 9433) behaviors Yuya Kusakabe
` (2 preceding siblings ...)
2026-05-04 16:30 ` [PATCH v2 3/7] seg6: add End.M.GTP6.E behavior Yuya Kusakabe
@ 2026-05-04 16:30 ` Yuya Kusakabe
2026-05-04 16:30 ` [PATCH v2 5/7] seg6: add End.M.GTP6.D.Di behavior Yuya Kusakabe
` (3 subsequent siblings)
7 siblings, 0 replies; 11+ messages in thread
From: Yuya Kusakabe @ 2026-05-04 16:30 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Andrea Mayer, Shuah Khan, Jonathan Corbet,
Shuah Khan
Cc: linux-kernel, netdev, linux-kselftest, linux-doc, Yuya Kusakabe
Add the End.M.GTP6.D headend behavior (RFC 9433 Section 6.3), which
receives an IPv6/UDP/GTP-U packet matching a locally instantiated
End.M.GTP6.D SID and re-encapsulates the inner T-PDU in SRv6 using
the configured SR Policy. TEID and QFI are folded into the 40-bit
Args.Mob.Session field defined by RFC 9433 Section 6.1.
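As an illustration of the folding described above (not part of the patch), the following Python sketch packs a TEID and QFI into a 40-bit Args.Mob.Session value and places it directly after a locator. The helper names pack_args_mob() and sid_with_args() are hypothetical; the bit layout (QFI in the top 6 bits, then R and U, then the 32-bit TEID / PDU Session ID) is assumed from the selftest comments elsewhere in this series.

```python
import ipaddress

ARGS_MOB_LEN = 40  # bits: QFI(6) | R(1) | U(1) | TEID / PDU Session ID (32)


def pack_args_mob(teid: int, qfi: int, r: int = 0, u: int = 0) -> int:
    """Return the 40-bit Args.Mob.Session value as an integer (hypothetical helper)."""
    if not 0 <= teid < (1 << 32):
        raise ValueError("TEID must fit in 32 bits")
    if not 0 <= qfi < (1 << 6):
        raise ValueError("QFI must fit in 6 bits")
    return (qfi << 34) | ((r & 1) << 33) | ((u & 1) << 32) | teid


def sid_with_args(locator: str, args_mob: int) -> ipaddress.IPv6Address:
    """Place args_mob in the 40 bits that immediately follow the locator prefix."""
    net = ipaddress.IPv6Network(locator)
    if net.prefixlen + ARGS_MOB_LEN > 128:
        raise ValueError("locator leaves no room for Args.Mob.Session")
    # Bits below Args.Mob.Session are SID padding and stay zero here.
    shift = 128 - net.prefixlen - ARGS_MOB_LEN
    return ipaddress.IPv6Address(int(net.network_address) | (args_mob << shift))


if __name__ == "__main__":
    # Same inputs as the End.M.GTP6.E selftest: /64 locator, TEID 0x123, QFI 5.
    print(sid_with_args("2001:db8:f::/64", pack_args_mob(0x123, 5)))
```

With a /64 locator the field occupies bytes 8 to 12 of the SID, which is why the route validation rejects prefixes longer than 88 bits.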
RFC 9433 Section 6.3 Step S08 specifies "Write in the SRH[0] the
Args.Mob.Session" for a single-SID SR Policy. When the SR Policy
contains more segments, the augmented SRH must reserve a leading
slot for the original outer destination D so that the downstream
End.M.GTP6.E can rebuild the GTP-U tunnel: Section 6.5 requires
End.M.GTP6.E to be the penultimate SID, and its Step S01 is "Copy
SRH[0] and D to buffer memory". Args.Mob.Session is therefore
stamped into segments[1], in the bits that follow the locator of
the End.M.GTP6.E SID.
The augmented SRH (slwt->srh + one extra leading slot) is built
once at build_state time and reused on every packet.
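To make the slot layout concrete, here is a rough Python sketch of the augmented segment list. It is an illustration under stated assumptions, not the kernel code: the function names are hypothetical, the policy segments are assumed to be given in SRH order (segments[0] is the last segment visited), and the per-packet fill of D and Args.Mob.Session is assumed to happen on a copy of the prebuilt template.

```python
import ipaddress

ARGS_MOB_LEN = 40  # bits


def stamp_args_mob(sid, prefix_len, args_mob):
    """Overwrite the 40 bits after the locator, preserving all other bits."""
    if not 1 <= prefix_len <= 128 - ARGS_MOB_LEN:
        raise ValueError("sr_prefix_len must be in [1, 88]")
    shift = 128 - prefix_len - ARGS_MOB_LEN
    mask = ((1 << ARGS_MOB_LEN) - 1) << shift
    return ipaddress.IPv6Address((int(sid) & ~mask) | (args_mob << shift))


def fill_augmented_segments(policy_segments, orig_da, teid, qfi, sr_prefix_len):
    """Return [D] + policy segments, with Args.Mob.Session stamped into slot 1.

    policy_segments: SIDs in SRH order (assumption), so policy_segments[0]
    is taken to be the End.M.GTP6.E SID that ends up in segments[1].
    """
    if not policy_segments:
        raise ValueError("SR Policy must contain at least one SID")
    args_mob = ((qfi & 0x3F) << 34) | (teid & 0xFFFFFFFF)  # R and U left as zero
    segs = [ipaddress.IPv6Address(orig_da)]                # extra leading slot for D
    segs += [ipaddress.IPv6Address(s) for s in policy_segments]
    segs[1] = stamp_args_mob(segs[1], sr_prefix_len, args_mob)
    return segs


if __name__ == "__main__":
    for seg in fill_augmented_segments(["2001:db8:2::e"], "2001:db8:9::2",
                                       teid=0x123, qfi=5, sr_prefix_len=64):
        print(seg)
```

Because only the 40 bits after the locator are rewritten, any function or padding bits configured in the low part of the End.M.GTP6.E SID survive the stamping.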
The new SEG6_LOCAL_MOBILE_SR_PREFIX_LEN attribute carries the
locator length used by the remote End.M.GTP6.E SID; it is required
because the SR Gateway has no way to discover the remote SID's
prefix length from the FIB on its own.
When net.netfilter.nf_hooks_lwtunnel=1, the inner T-PDU traverses
NF_INET_PRE_ROUTING between the GTP-U strip and the SRv6 push,
mirroring End.DX4 / End.DX6.
Inbound GTP-U packets are classified by message type (3GPP TS
29.281 Section 5.1). Only T-PDU (type 255) is encapsulated into
SRv6. Any other GTP-U message (Echo Request/Response, Error
Indication, ...) is forwarded unchanged via the lwtunnel's saved
orig_input so that a downstream peer that owns the GTP-U control
plane can process it.
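The classification above can be sketched in Python as follows. This is a simplified model of the header walk, not the kernel implementation: parse_gtpu() is a hypothetical name, returning None stands in for the pass-through path, and ValueError stands in for a drop. The constants (T-PDU type 255, PDU Session extension type 0x85, flag bits E/S/PN) follow the values used elsewhere in this series.

```python
import struct

GTP_TPDU = 255                 # GTP-U message type for T-PDU
GTP1_F_EXTHDR = 0x04           # E flag: extension header chain present
GTP1_F_OPT_MASK = 0x07         # E | S | PN: any of these selects the long header
GTP1U_FLAGS_BASE = 0x30        # version 1, protocol type GTP
PDU_SESSION_NH = 0x85          # PDU Session Container extension type


def parse_gtpu(buf: bytes):
    """Return (hdrlen, teid, qfi) for a T-PDU, None for other messages.

    Raises ValueError when the header is structurally malformed.
    """
    if len(buf) < 8:
        raise ValueError("truncated GTP-U header")
    flags, msg_type, _length, teid = struct.unpack_from("!BBHI", buf, 0)
    if (flags & ~GTP1_F_OPT_MASK & 0xFF) != GTP1U_FLAGS_BASE or msg_type != GTP_TPDU:
        return None  # Echo, Error Indication, ...: forwarded unchanged
    hdrlen, qfi = 8, 0
    if not flags & GTP1_F_OPT_MASK:
        return hdrlen, teid, qfi
    if len(buf) < 12:
        raise ValueError("truncated long GTP-U header")
    hdrlen = 12
    if not flags & GTP1_F_EXTHDR:
        return hdrlen, teid, qfi
    nxt = buf[11]
    while nxt != 0:
        if len(buf) < hdrlen + 1:
            raise ValueError("truncated extension chain")
        ext_bytes = buf[hdrlen] * 4          # length is in 4-byte units
        if ext_bytes == 0 or len(buf) < hdrlen + ext_bytes:
            raise ValueError("bad extension header length")
        if nxt == PDU_SESSION_NH:
            if ext_bytes != 4:
                raise ValueError("unexpected PDU Session extension size")
            qfi = buf[hdrlen + 2] & 0x3F
        nxt = buf[hdrlen + ext_bytes - 1]    # last byte is the next extension type
        hdrlen += ext_bytes
    return hdrlen, teid, qfi
```

Keeping the "not a T-PDU" result distinct from the malformed case mirrors the split between pass-through and drop described in the paragraph above.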
Configuration:
  ip -6 route add 2001:db8:f::/64 \
      encap seg6local action End.M.GTP6.D \
      srh segs 2001:db8:2::e \
      src 2001:db8:2::1 \
      sr_prefix_len 64 \
      dev <dev>
Link: https://www.rfc-editor.org/rfc/rfc9433.html#section-6.3
Signed-off-by: Yuya Kusakabe <yuya.kusakabe@gmail.com>
---
include/uapi/linux/seg6_local.h | 3 +
net/ipv6/seg6_local.c | 512 +++++++++++++++++++++
tools/testing/selftests/net/Makefile | 1 +
.../selftests/net/srv6_end_m_gtp6_d_test.sh | 497 ++++++++++++++++++++
4 files changed, 1013 insertions(+)
diff --git a/include/uapi/linux/seg6_local.h b/include/uapi/linux/seg6_local.h
index 8e46ede2980d..7d3d3d245b47 100644
--- a/include/uapi/linux/seg6_local.h
+++ b/include/uapi/linux/seg6_local.h
@@ -33,6 +33,7 @@ enum {
SEG6_LOCAL_MOBILE_V4_MASK_LEN,
SEG6_LOCAL_MOBILE_PDU_TYPE,
SEG6_LOCAL_MOBILE_V6_SRC_PREFIX_LEN,
+ SEG6_LOCAL_MOBILE_SR_PREFIX_LEN,
__SEG6_LOCAL_MAX,
};
#define SEG6_LOCAL_MAX (__SEG6_LOCAL_MAX - 1)
@@ -77,6 +78,8 @@ enum {
SEG6_LOCAL_ACTION_END_M_GTP4_E = 18,
/* SRv6 to IPv6/GTP-U encap (RFC 9433 Section 6.5) */
SEG6_LOCAL_ACTION_END_M_GTP6_E = 19,
+ /* IPv6/GTP-U decap into SRv6 (RFC 9433 Section 6.3) */
+ SEG6_LOCAL_ACTION_END_M_GTP6_D = 20,
__SEG6_LOCAL_ACTION_MAX,
};
diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c
index 4e5d138c3657..09e912e17df8 100644
--- a/net/ipv6/seg6_local.c
+++ b/net/ipv6/seg6_local.c
@@ -13,6 +13,7 @@
#include <linux/net.h>
#include <linux/module.h>
#include <net/ip.h>
+#include <net/ipv6.h>
#include <net/lwtunnel.h>
#include <net/netevent.h>
#include <net/netns/generic.h>
@@ -195,7 +196,9 @@ struct seg6_mobile_info {
u8 v4_mask_len; /* IPv4 portion length (bits) */
u8 pdu_type; /* PDU Type (0=downlink, 1=uplink) */
bool pdu_type_set; /* PDU Session Container enabled */
+ u8 sr_prefix_len; /* egress SR prefix length (bits) */
u8 v6_src_prefix_len; /* Source UPF Prefix length (bits) */
+ struct ipv6_sr_hdr *aug_srh; /* augmented SRH for End.M.GTP6.D{,.Di} */
};
#define SEG6_MOBILE_V6_SRC_PREFIX_LEN_DEFAULT 64
@@ -1638,6 +1641,53 @@ static bool seg6_mobile_extract_args_mob(const struct in6_addr *daddr,
return true;
}
+/* Write @nbits of @val (top bits) into a 16-byte big-endian @addr at
+ * bit offset @bit_off, preserving surrounding bits. Caller ensures
+ * bit_off + nbits <= 128 and 1 <= nbits <= 64.
+ */
+static void seg6_mobile_addr_set_bits(u8 *addr, unsigned int bit_off,
+ unsigned int nbits, u64 val)
+{
+ u64 hi = get_unaligned_be64(addr);
+ u64 lo = get_unaligned_be64(addr + 8);
+ u64 mask_hi, mask_lo;
+
+ val &= GENMASK_ULL(63, 64 - nbits);
+
+ if (bit_off >= 64) {
+ mask_lo = GENMASK_ULL(63, 64 - nbits) >> (bit_off - 64);
+ lo = (lo & ~mask_lo) | (val >> (bit_off - 64));
+ } else if (bit_off + nbits <= 64) {
+ mask_hi = GENMASK_ULL(63, 64 - nbits) >> bit_off;
+ hi = (hi & ~mask_hi) | (val >> bit_off);
+ } else {
+ unsigned int hi_bits = 64 - bit_off;
+
+ mask_hi = GENMASK_ULL(hi_bits - 1, 0);
+ mask_lo = GENMASK_ULL(63, 64 - (nbits - hi_bits));
+ hi = (hi & ~mask_hi) | (val >> bit_off);
+ lo = (lo & ~mask_lo) | ((val << hi_bits) & mask_lo);
+ }
+
+ put_unaligned_be64(hi, addr);
+ put_unaligned_be64(lo, addr + 8);
+}
+
+/* @prefix_bits is bounded to [1, 88] by parse_nla_mobile_sr_prefix_len()
+ * before this function is reached, so the guard below is unreachable
+ * today; it is kept as defense in depth against a future regression.
+ */
+static int seg6_mobile_write_args_mob(struct in6_addr *addr,
+ unsigned int prefix_bits, u64 args_mob)
+{
+ if (prefix_bits + SEG6_MOBILE_ARGS_MOB_LEN > 128)
+ return -EINVAL;
+
+ seg6_mobile_addr_set_bits(addr->s6_addr, prefix_bits,
+ SEG6_MOBILE_ARGS_MOB_LEN, args_mob);
+ return 0;
+}
+
/* GTP-U PDU Session extension header (3GPP TS 38.415).
* 4-byte minimum unit: ext_len=1, PDU Type in high 4 bits of @pdu_type_spare,
* QFI in low 6 bits of @spare_qfi, next_ext=0.
@@ -1665,6 +1715,16 @@ struct seg6_mobile_pdu_session_ext {
#define SEG6_MOBILE_ARGS_QFI_SHIFT 58
#define SEG6_MOBILE_ARGS_TEID_SHIFT 24
+/* Combine TEID and QFI into a left-justified Args.Mob.Session value
+ * (RFC 9433 Section 6.1 Figure 8); R/U are emitted as zero.
+ */
+static u64 seg6_mobile_args_from_teid_qfi(u32 teid, u8 qfi)
+{
+ return ((u64)(qfi & SEG6_MOBILE_PDU_SESSION_QFI_MASK) <<
+ SEG6_MOBILE_ARGS_QFI_SHIFT) |
+ ((u64)teid << SEG6_MOBILE_ARGS_TEID_SHIFT);
+}
+
static u8 seg6_mobile_qfi_from_args(u64 args_mob)
{
return (args_mob >> SEG6_MOBILE_ARGS_QFI_SHIFT) &
@@ -1723,6 +1783,121 @@ static int seg6_mobile_push_gtpu(struct sk_buff *skb, u32 teid, u8 qfi,
return 0;
}
+/* Parse the GTP-U header at @skb offset @gtp_off. Pulls each
+ * additional region (long header, extension chain) into the linear
+ * area as it walks; on success returns the total header length to
+ * consume (mandatory + optional + extension headers), or a negative
+ * errno on failure.
+ *
+ * Returns -EOPNOTSUPP if the packet is a well-formed GTPv1-U header
+ * that this code path does not consume itself (any non-T-PDU message
+ * such as Echo Request / Error Indication). Callers pass such packets
+ * through to the configured forwarding path via
+ * seg6_mobile_passthrough_non_tpdu().
+ *
+ * Returns -EINVAL when the GTP-U header is structurally malformed
+ * (truncated extension chain, ext_units == 0, etc.). Callers should
+ * drop those.
+ *
+ * On success, *@teid is set to the GTP-U TEID and *@qfi is set to the
+ * QFI found in a PDU Session extension header, or 0 if none is present.
+ *
+ * Callers must re-derive any pointers into @skb->data after this
+ * function returns: pskb_may_pull() may have reallocated skb->head.
+ */
+static int seg6_mobile_parse_gtpu(struct sk_buff *skb, unsigned int gtp_off,
+ u32 *teid, u8 *qfi)
+{
+ const struct gtp1_header *gtph;
+ const struct gtp1_header_long *gtphl;
+ const u8 *gtp;
+ unsigned int hdrlen;
+ u8 flags, next;
+
+ if (!pskb_may_pull(skb, gtp_off + sizeof(*gtph)))
+ return -EINVAL;
+ gtp = skb->data + gtp_off;
+ gtph = (const struct gtp1_header *)gtp;
+ flags = gtph->flags;
+
+ /* Accept only GTPv1-U T-PDU (3GPP TS 29.281 Section 5.1). Other
+ * GTPv1-U message types (Echo Request/Response, Error Indication,
+ * ...) are dispatched separately by the caller.
+ */
+ if ((flags & ~GTP1_F_MASK) != SEG6_MOBILE_GTP1U_FLAGS_BASE)
+ return -EOPNOTSUPP;
+ if (gtph->type != GTP_TPDU)
+ return -EOPNOTSUPP;
+
+ *teid = ntohl(gtph->tid);
+ *qfi = 0;
+
+ if (!(flags & (GTP1_F_EXTHDR | GTP1_F_SEQ | GTP1_F_NPDU)))
+ return sizeof(*gtph);
+
+ if (!pskb_may_pull(skb, gtp_off + sizeof(*gtphl)))
+ return -EINVAL;
+ gtp = skb->data + gtp_off;
+ gtphl = (const struct gtp1_header_long *)gtp;
+ hdrlen = sizeof(*gtphl);
+
+ if (!(flags & GTP1_F_EXTHDR))
+ return hdrlen;
+
+ next = gtphl->next;
+ while (next != 0) {
+ unsigned int ext_units, ext_bytes;
+ const u8 *ext;
+
+ if (!pskb_may_pull(skb, gtp_off + hdrlen + 1))
+ return -EINVAL;
+ ext = skb->data + gtp_off + hdrlen;
+ ext_units = ext[0];
+ if (ext_units == 0)
+ return -EINVAL;
+
+ ext_bytes = ext_units * 4;
+ if (!pskb_may_pull(skb, gtp_off + hdrlen + ext_bytes))
+ return -EINVAL;
+ ext = skb->data + gtp_off + hdrlen;
+
+ if (next == SEG6_MOBILE_PDU_SESSION_NH) {
+ /* 3GPP TS 38.415: the PDU Session extension header
+ * is exactly 4 bytes long.
+ */
+ if (ext_bytes != 4)
+ return -EINVAL;
+ *qfi = ext[2] & SEG6_MOBILE_PDU_SESSION_QFI_MASK;
+ }
+
+ next = ext[ext_bytes - 1];
+ hdrlen += ext_bytes;
+ }
+
+ return hdrlen;
+}
+
+/* Pass a non-T-PDU GTP-U message (Echo, Error Indication, ...) through
+ * the configured forwarding path so that a downstream UPF (which owns
+ * the GTP-U control plane) can process it. The packet is delivered via
+ * the lwtunnel's saved orig_input -- ip6_forward for an IPv6 SID route
+ * or ip_forward for an IPv4 route -- which forwards using the existing
+ * skb_dst, reaching the UPF that lives in the L3 network behind the
+ * SRGW.
+ *
+ * @skb is consumed.
+ */
+static int seg6_mobile_passthrough_non_tpdu(struct sk_buff *skb)
+{
+ struct dst_entry *dst = skb_dst(skb);
+
+ if (dst && dst->lwtstate && dst->lwtstate->orig_input)
+ return dst->lwtstate->orig_input(skb);
+
+ kfree_skb_reason(skb, SKB_DROP_REASON_SEG6_MOBILE_BAD_GTPU);
+ return -EINVAL;
+}
+
/* Per-skb context preserved across the NF_INET_PRE_ROUTING hook on
* the inner T-PDU exposed by End.M.GTP4.E. After the outer SRv6 has
* been popped the inner IP is briefly visible to netfilter; the
@@ -2120,6 +2295,289 @@ static int input_action_end_m_gtp6_e_finish(struct net *net,
return -EINVAL;
}
+/* Per-skb context preserved across the NF_INET_PRE_ROUTING hook on
+ * the inner T-PDU exposed by End.M.GTP6.D / End.M.GTP6.D.Di. The
+ * outer IPv6/UDP/GTP-U is gone by the time the finish callback runs,
+ * but the SRH built in finish still needs the original outer DA and
+ * the Args.Mob.Session derived from TEID/QFI.
+ */
+struct seg6_mobile_gtp6_d_cb {
+ u64 args_mob;
+ struct in6_addr orig_dst;
+};
+
+#define SEG6_MOBILE_GTP6_D_CB(skb) \
+ ((struct seg6_mobile_gtp6_d_cb *)((skb)->cb))
+
+static int input_action_end_m_gtp6_d_finish(struct net *net,
+ struct sock *sk,
+ struct sk_buff *skb)
+{
+ struct seg6_mobile_gtp6_d_cb cb = *SEG6_MOBILE_GTP6_D_CB(skb);
+ struct dst_entry *orig_dst = skb_dst(skb);
+ enum skb_drop_reason reason;
+ const struct seg6_mobile_info *minfo;
+ struct seg6_local_lwt *slwt;
+ struct ipv6_sr_hdr *new_srh;
+ int inner_proto;
+ int err;
+
+ slwt = seg6_local_lwtunnel(orig_dst->lwtstate);
+ minfo = &slwt->mobile_info;
+
+ inner_proto = (skb->protocol == htons(ETH_P_IP)) ? IPPROTO_IPIP
+ : IPPROTO_IPV6;
+
+ err = seg6_do_srh_encap(skb, minfo->aug_srh, inner_proto);
+ if (err) {
+ reason = (err == -ENOMEM) ? SKB_DROP_REASON_SEG6_MOBILE_NOMEM
+ : SKB_DROP_REASON_SEG6_MOBILE_BAD_INNER;
+ goto drop;
+ }
+
+ skb->protocol = htons(ETH_P_IPV6);
+
+ new_srh = (struct ipv6_sr_hdr *)(skb_network_header(skb) +
+ sizeof(struct ipv6hdr));
+ new_srh->segments[0] = cb.orig_dst;
+ if (seg6_mobile_write_args_mob(&new_srh->segments[1],
+ minfo->sr_prefix_len, cb.args_mob)) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_SID;
+ goto drop;
+ }
+
+ ipv6_hdr(skb)->saddr = minfo->src_addr;
+
+ /* seg6_do_srh_encap() copied segments[first_segment] to the outer
+ * DA before Args.Mob.Session was stamped; refresh it.
+ */
+ ipv6_hdr(skb)->daddr = new_srh->segments[new_srh->first_segment];
+
+ skb_set_transport_header(skb, sizeof(struct ipv6hdr));
+ nf_reset_ct(skb);
+ skb_dst_drop(skb);
+
+ seg6_lookup_any_nexthop(skb, NULL, 0, false, slwt->oif);
+ return dst_input(skb);
+
+drop:
+ kfree_skb_reason(skb, reason);
+ return -EINVAL;
+}
+
+/* RFC 9433 Section 6.3 -- End.M.GTP6.D
+ * Receives an IPv6/UDP/GTP-U packet matching a locally instantiated
+ * End.M.GTP6.D SID and re-encapsulates the inner T-PDU in SRv6 using
+ * the configured SR Policy. TEID and QFI are folded into
+ * Args.Mob.Session. Per RFC 9433 Section 6.5 ("End.M.GTP6.E SID MUST
+ * always be the penultimate SID"), Args.Mob.Session is encoded into
+ * segments[1] of the new SRH (the penultimate SID at the egress UPF)
+ * while segments[0] holds the original outer DA so that the egress
+ * has a real GTP-U destination after End.M.GTP6.E decap.
+ *
+ * When net.netfilter.nf_hooks_lwtunnel=1 the inner T-PDU is exposed
+ * to NF_INET_PRE_ROUTING after the GTP-U strip and before the SRv6
+ * push, mirroring End.DX4 / End.DX6. This lets nftables / conntrack
+ * apply policy on the inner 5-tuple at the SR Gateway.
+ */
+static int input_action_end_m_gtp6_d(struct sk_buff *skb,
+ struct seg6_local_lwt *slwt)
+{
+ unsigned int outer_len, inner_off;
+ int gtp_hdrlen, inner_proto, inner_nfproto;
+ struct in6_addr orig_dst;
+ u8 inner_first, qfi;
+ struct ipv6_sr_hdr *srh;
+ struct ipv6hdr *ip6h;
+ struct udphdr *uh;
+ u64 args_mob;
+ u32 teid;
+ enum skb_drop_reason reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_GTPU;
+
+ BUILD_BUG_ON(sizeof(struct seg6_mobile_gtp6_d_cb) >
+ sizeof_field(struct sk_buff, cb));
+
+ /* RFC 9433 Section 6.3 SRH-S01: drop if outer SRH carries
+ * SegmentsLeft != 0
+ */
+ srh = seg6_get_srh(skb, 0);
+ if (srh && srh->segments_left != 0) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_INVALID_SRH_SL;
+ goto drop;
+ }
+
+ if (!pskb_may_pull(skb, sizeof(struct ipv6hdr))) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_INNER;
+ goto drop;
+ }
+
+ ip6h = ipv6_hdr(skb);
+ orig_dst = ip6h->daddr;
+
+ /* RFC 9433 Section 6.3 upper-layer S01-S11: dispatch on
+ * (NH == UDP && UDP dport == GTP-U); otherwise delegate to the
+ * regular End behaviour (S10-S11).
+ */
+ {
+ __be16 frag_off;
+ u8 nh = ip6h->nexthdr;
+ int upper_off;
+
+ upper_off = ipv6_skip_exthdr(skb, sizeof(*ip6h), &nh,
+ &frag_off);
+ if (upper_off < 0) {
+ /* Outer IPv6 ext-header walk failed; the GTP-U
+ * envelope below it is unreachable.
+ */
+ reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_GTPU;
+ goto drop;
+ }
+
+ if (nh != IPPROTO_UDP)
+ return input_action_end(skb, slwt);
+
+ if (!pskb_may_pull(skb, upper_off + sizeof(*uh))) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_GTPU;
+ goto drop;
+ }
+
+ ip6h = ipv6_hdr(skb);
+ uh = (struct udphdr *)((u8 *)ip6h + upper_off);
+ if (uh->dest != htons(GTP1U_PORT))
+ return input_action_end(skb, slwt);
+
+ gtp_hdrlen = seg6_mobile_parse_gtpu(skb,
+ upper_off + sizeof(*uh),
+ &teid, &qfi);
+ if (gtp_hdrlen == -EOPNOTSUPP)
+ return seg6_mobile_passthrough_non_tpdu(skb);
+ if (gtp_hdrlen < 0) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_GTPU;
+ goto drop;
+ }
+
+ outer_len = upper_off + sizeof(*uh) + gtp_hdrlen;
+ }
+
+ args_mob = seg6_mobile_args_from_teid_qfi(teid, qfi);
+
+ if (!pskb_may_pull(skb, outer_len + 1)) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_INNER;
+ goto drop;
+ }
+
+ inner_off = outer_len;
+ inner_first = *((u8 *)skb->data + inner_off);
+ switch (inner_first >> 4) {
+ case 4:
+ inner_proto = IPPROTO_IPIP;
+ inner_nfproto = NFPROTO_IPV4;
+ break;
+ case 6:
+ inner_proto = IPPROTO_IPV6;
+ inner_nfproto = NFPROTO_IPV6;
+ break;
+ default:
+ reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_INNER;
+ goto drop;
+ }
+
+ if (!pskb_may_pull(skb, outer_len +
+ ((inner_proto == IPPROTO_IPIP) ?
+ sizeof(struct iphdr) : sizeof(struct ipv6hdr)))) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_INNER;
+ goto drop;
+ }
+
+ skb_pull_rcsum(skb, outer_len);
+ skb_reset_network_header(skb);
+
+ /* Set skb->protocol to match the inner header so that the
+ * NF_INET_PRE_ROUTING hook (and seg6_do_srh_encap() inside
+ * the finish half) see a coherent IPv4/IPv6 packet.
+ */
+ skb->protocol = (inner_proto == IPPROTO_IPIP) ? htons(ETH_P_IP)
+ : htons(ETH_P_IPV6);
+
+ skb_set_transport_header(skb,
+ (inner_proto == IPPROTO_IPIP) ?
+ sizeof(struct iphdr) :
+ sizeof(struct ipv6hdr));
+ nf_reset_ct(skb);
+
+ SEG6_MOBILE_GTP6_D_CB(skb)->args_mob = args_mob;
+ SEG6_MOBILE_GTP6_D_CB(skb)->orig_dst = orig_dst;
+
+ if (static_branch_unlikely(&nf_hooks_lwtunnel_enabled))
+ return NF_HOOK(inner_nfproto, NF_INET_PRE_ROUTING,
+ dev_net(skb->dev), NULL, skb, skb->dev,
+ NULL, input_action_end_m_gtp6_d_finish);
+
+ return input_action_end_m_gtp6_d_finish(dev_net(skb->dev), NULL, skb);
+
+drop:
+ kfree_skb_reason(skb, reason);
+ return -EINVAL;
+}
+
+/* Shared between End.M.GTP6.D and End.M.GTP6.D.Di -- both
+ * prepend a single leading slot to the user-configured SRH to leave
+ * room for the original outer DA at SRH[0]. End.M.GTP6.D writes
+ * Args.Mob.Session into segments[1] at runtime; End.M.GTP6.D.Di
+ * leaves segments[1+] as the user provided them.
+ */
+static int seg6_end_m_gtp6_d_aug_build(struct seg6_local_lwt *slwt,
+ const void *cfg,
+ struct netlink_ext_ack *extack)
+{
+ struct ipv6_sr_hdr *aug;
+ int orig_len, aug_len;
+
+ if (!slwt->srh) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "End.M.GTP6.D{,.Di} requires srh segs");
+ return -EINVAL;
+ }
+
+ /* The augmented SRH adds one leading 16-byte slot, i.e. +2 in
+ * the 8-octet units of hdrlen, and the result must still fit in
+ * the u8 hdrlen field. Reject oversized srh inputs at setup time
+ * so that no silent overflow can produce an undersized
+ * aug->hdrlen and a subsequent OOB read in seg6_do_srh_encap().
+ */
+ if (slwt->srh->hdrlen > 253) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "End.M.GTP6.D{,.Di} srh too large to augment (max 126 segments)");
+ return -EINVAL;
+ }
+
+ orig_len = (slwt->srh->hdrlen + 1) << 3;
+ aug_len = orig_len + sizeof(struct in6_addr);
+
+ aug = kzalloc(aug_len, GFP_KERNEL);
+ if (!aug)
+ return -ENOMEM;
+
+ memcpy(aug, slwt->srh, sizeof(*aug));
+ aug->hdrlen = (aug_len >> 3) - 1;
+ aug->segments_left = slwt->srh->segments_left + 1;
+ aug->first_segment = slwt->srh->first_segment + 1;
+ /* segments[0] left zero; data path stamps the original outer
+ * DA into the in-skb copy after seg6_do_srh_encap().
+ */
+ memcpy(&aug->segments[1], &slwt->srh->segments[0],
+ orig_len - sizeof(*aug));
+
+ slwt->mobile_info.aug_srh = aug;
+ return 0;
+}
+
+static void seg6_end_m_gtp6_d_aug_destroy(struct seg6_local_lwt *slwt)
+{
+ kfree(slwt->mobile_info.aug_srh);
+ slwt->mobile_info.aug_srh = NULL;
+}
+
/* RFC 9433 Section 6.5 -- End.M.GTP6.E
* Receives an SRv6 packet whose current SID is an End.M.GTP6.E SID
* (Segments Left == 1) and re-encapsulates the inner payload in
@@ -2442,6 +2900,19 @@ static struct seg6_action_desc seg6_action_table[] = {
.build_state = seg6_mobile_gtp6_e_validate,
},
},
+ {
+ .action = SEG6_LOCAL_ACTION_END_M_GTP6_D,
+ .attrs = SEG6_F_ATTR(SEG6_LOCAL_SRH) |
+ SEG6_F_ATTR(SEG6_LOCAL_MOBILE_SRC_ADDR) |
+ SEG6_F_ATTR(SEG6_LOCAL_MOBILE_SR_PREFIX_LEN),
+ .optattrs = SEG6_F_LOCAL_COUNTERS |
+ SEG6_F_ATTR(SEG6_LOCAL_OIF),
+ .input = input_action_end_m_gtp6_d,
+ .slwt_ops = {
+ .build_state = seg6_end_m_gtp6_d_aug_build,
+ .destroy_state = seg6_end_m_gtp6_d_aug_destroy,
+ },
+ },
{
.action = SEG6_LOCAL_ACTION_END_MAP,
.attrs = SEG6_F_ATTR(SEG6_LOCAL_NH6),
@@ -2542,6 +3013,7 @@ static const struct nla_policy seg6_local_policy[SEG6_LOCAL_MAX + 1] = {
NLA_POLICY_EXACT_LEN(sizeof(struct in6_addr)),
[SEG6_LOCAL_MOBILE_V4_MASK_LEN] = { .type = NLA_U8 },
[SEG6_LOCAL_MOBILE_PDU_TYPE] = { .type = NLA_U8 },
+ [SEG6_LOCAL_MOBILE_SR_PREFIX_LEN] = { .type = NLA_U8 },
[SEG6_LOCAL_MOBILE_V6_SRC_PREFIX_LEN] = { .type = NLA_U8 },
};
@@ -2957,6 +3429,39 @@ static int seg6_mobile_gtp6_e_validate(struct seg6_local_lwt *slwt,
return 0;
}
+static int parse_nla_mobile_sr_prefix_len(struct nlattr **attrs,
+ struct seg6_local_lwt *slwt,
+ struct netlink_ext_ack *extack)
+{
+ u8 len = nla_get_u8(attrs[SEG6_LOCAL_MOBILE_SR_PREFIX_LEN]);
+
+ /* The SR locator must be non-zero and leave room for the 40-bit
+ * Args.Mob.Session that follows it (RFC 9433 Section 6.5/6.7).
+ */
+ if (len == 0 || len + SEG6_MOBILE_ARGS_MOB_LEN > 128) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "SRv6 Mobile SR prefix length must be in 1..88 (leaving room for the 40-bit Args.Mob.Session)");
+ return -EINVAL;
+ }
+ slwt->mobile_info.sr_prefix_len = len;
+ return 0;
+}
+
+static int put_nla_mobile_sr_prefix_len(struct sk_buff *skb,
+ struct seg6_local_lwt *slwt)
+{
+ if (nla_put_u8(skb, SEG6_LOCAL_MOBILE_SR_PREFIX_LEN,
+ slwt->mobile_info.sr_prefix_len))
+ return -EMSGSIZE;
+ return 0;
+}
+
+static int cmp_nla_mobile_sr_prefix_len(struct seg6_local_lwt *a,
+ struct seg6_local_lwt *b)
+{
+ return a->mobile_info.sr_prefix_len != b->mobile_info.sr_prefix_len;
+}
+
#define MAX_PROG_NAME 256
static const struct nla_policy bpf_prog_policy[SEG6_LOCAL_BPF_PROG_MAX + 1] = {
[SEG6_LOCAL_BPF_PROG] = { .type = NLA_U32, },
@@ -3399,6 +3904,10 @@ static struct seg6_action_param seg6_action_params[SEG6_LOCAL_MAX + 1] = {
.put = put_nla_mobile_pdu_type,
.cmp = cmp_nla_mobile_pdu_type },
+ [SEG6_LOCAL_MOBILE_SR_PREFIX_LEN] = { .parse = parse_nla_mobile_sr_prefix_len,
+ .put = put_nla_mobile_sr_prefix_len,
+ .cmp = cmp_nla_mobile_sr_prefix_len },
+
[SEG6_LOCAL_MOBILE_V6_SRC_PREFIX_LEN] = { .parse = parse_nla_mobile_v6_src_prefix_len,
.put = put_nla_mobile_v6_src_prefix_len,
.cmp = cmp_nla_mobile_v6_src_prefix_len },
@@ -3727,6 +4236,9 @@ static int seg6_local_get_encap_size(struct lwtunnel_state *lwt)
if (attrs & SEG6_F_ATTR(SEG6_LOCAL_MOBILE_PDU_TYPE))
nlsize += nla_total_size(1);
+ if (attrs & SEG6_F_ATTR(SEG6_LOCAL_MOBILE_SR_PREFIX_LEN))
+ nlsize += nla_total_size(1);
+
if (attrs & SEG6_F_ATTR(SEG6_LOCAL_MOBILE_V6_SRC_PREFIX_LEN))
nlsize += nla_total_size(1);
diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index 01dafec5b60f..242195d7a8d8 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -91,6 +91,7 @@ TEST_PROGS := \
srv6_end_dx6_netfilter_test.sh \
srv6_end_flavors_test.sh \
srv6_end_m_gtp4_e_test.sh \
+ srv6_end_m_gtp6_d_test.sh \
srv6_end_m_gtp6_e_test.sh \
srv6_end_map_test.sh \
srv6_end_next_csid_l3vpn_test.sh \
diff --git a/tools/testing/selftests/net/srv6_end_m_gtp6_d_test.sh b/tools/testing/selftests/net/srv6_end_m_gtp6_d_test.sh
new file mode 100755
index 000000000000..deba76e683c1
--- /dev/null
+++ b/tools/testing/selftests/net/srv6_end_m_gtp6_d_test.sh
@@ -0,0 +1,497 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# shellcheck disable=SC2034,SC2154
+#
+# Selftest for the SRv6 End.M.GTP6.D behavior (RFC 9433 Section 6.3).
+#
+# +-------+ 2001:db8:1::/64 +-------+ 2001:db8:2::/64 +-------+
+# | gnb | ------------------- | srgw | ------------------- | srupf |
+# +-------+ veth-n3 +-------+ veth-n9 +-------+
+# |
+# | 2001:db8:6::/64
+# +--------veth-n6--------- +-------+
+# | lupf |
+# +-------+
+#
+# gnb is the GTP-U-side test peer that injects the GTP-U packets.
+# srupf is the SR-domain-side SRv6-aware UPF (RFC 9433 sense, not
+# a 3GPP UPF) that receives the resulting SRv6 T-PDU. lupf is the
+# SRv6-non-aware legacy UPF that owns the GTP-U control plane and
+# receives non-T-PDU GTP-U (Echo Request, Error Indication, ...)
+# forwarded by srgw via the End.M.GTP6.D route's dev. srgw runs the
+# End.M.GTP6.D behavior under test.
+#
+# An End.M.GTP6.D SID is installed on srgw for prefix 2001:db8:f::/64
+# with src=2001:db8:2::1. Args.Mob.Session is the fixed 40-bit field
+# defined by RFC 9433 Section 6.1, Figure 8. When gnb sends an
+# IPv6/UDP/GTP-U packet to 2001:db8:f::1 carrying TEID 0x123 and QFI 5,
+# srgw is expected to emit an SRv6 packet toward 2001:db8:2::e whose
+# SRH[1] carries Args.Mob.Session in bytes 8..12, right after the /64
+# locator (QFI=5, R=0, U=0, PDU Session ID=0x123 → bytes 14 00 00 01 23).
+
+source lib.sh
+
+readonly TIMEOUT=4
+
+tcpdump_pid=""
+have_vrf=0
+
+cleanup()
+{
+ if [ -n "$tcpdump_pid" ]; then
+ kill "$tcpdump_pid" 2>/dev/null
+ wait "$tcpdump_pid" 2>/dev/null
+ fi
+ cleanup_all_ns
+}
+
+trap cleanup EXIT
+
+setup()
+{
+ setup_ns gnb srgw srupf lupf srupf_vrf
+
+ ip -n "$gnb" link set lo up
+ ip -n "$srgw" link set lo up
+ ip -n "$srupf" link set lo up
+ ip -n "$lupf" link set lo up
+ ip -n "$srupf_vrf" link set lo up
+
+ # gnb <-> srgw
+ ip link add veth-n3 netns "$gnb" type veth peer name veth-n3-srgw \
+ netns "$srgw"
+ ip -n "$gnb" addr add 2001:db8:1::2/64 dev veth-n3 nodad
+ ip -n "$srgw" addr add 2001:db8:1::1/64 dev veth-n3-srgw nodad
+ ip -n "$gnb" link set veth-n3 up
+ ip -n "$srgw" link set veth-n3-srgw up
+
+ # srgw <-> srupf (SR-aware UPF, T-PDU SRv6 destination)
+ ip link add veth-n9 netns "$srgw" type veth peer name veth-n9-srupf \
+ netns "$srupf"
+ ip -n "$srgw" addr add 2001:db8:2::1/64 dev veth-n9 nodad
+ ip -n "$srupf" addr add 2001:db8:2::e/64 dev veth-n9-srupf nodad
+ ip -n "$srgw" link set veth-n9 up
+ ip -n "$srupf" link set veth-n9-srupf up
+
+ # srgw <-> lupf (legacy UPF, GTP-U control plane recipient)
+ ip link add veth-n6 netns "$srgw" type veth peer name veth-n6-lupf \
+ netns "$lupf"
+ ip -n "$srgw" addr add 2001:db8:6::1/64 dev veth-n6 nodad
+ ip -n "$lupf" addr add 2001:db8:6::e/64 dev veth-n6-lupf nodad
+ ip -n "$srgw" link set veth-n6 up
+ ip -n "$lupf" link set veth-n6-lupf up
+
+ ip netns exec "$srgw" sysctl -wq net.ipv6.conf.all.forwarding=1
+ ip netns exec "$srgw" sysctl -wq net.ipv6.conf.all.seg6_enabled=1
+ ip netns exec "$srgw" sysctl -wq net.ipv6.conf.veth-n9.seg6_enabled=1
+ ip netns exec "$srgw" sysctl -wq net.ipv6.conf.veth-n6.seg6_enabled=1
+ ip netns exec "$srupf" sysctl -wq net.ipv6.conf.all.seg6_enabled=1
+ ip netns exec "$srupf" sysctl -wq net.ipv6.conf.veth-n9-srupf.seg6_enabled=1
+
+ # route on gnb toward the End.M.GTP6.D SID
+ ip -n "$gnb" -6 route add 2001:db8:f::/64 via 2001:db8:1::1
+
+ # install End.M.GTP6.D on srgw. sr_prefix_len declares the locator
+ # length used by the remote End.M.GTP6.E SID; with /64 the kernel
+ # writes Args.Mob.Session into bytes 8..12 of the penultimate SID.
+ # dev veth-n6 is the legacy UPF leg: T-PDU encap takes the IPv6 SR
+ # Policy path (independent of dst.dev) while non-T-PDU is forwarded
+ # out veth-n6 via ip6_forward.
+ ip -n "$srgw" -6 route add 2001:db8:f::/64 \
+ encap seg6local action End.M.GTP6.D \
+ srh segs 2001:db8:2::e,2001:db8:3::e \
+ src 2001:db8:2::1 sr_prefix_len 64 count \
+ dev veth-n6
+
+ # accept the SRv6 packet on srupf
+ ip -n "$srupf" -6 route add 2001:db8:2::e/128 dev lo
+ ip -n "$srupf" -6 route add 2001:db8:3::/64 dev veth-n9-srupf
+
+ # avoid ND-resolution timing flakiness with static neighbours
+ local srupf_mac srgw_n9_mac lupf_mac
+ srupf_mac=$(ip -n "$srupf" -j link show veth-n9-srupf | \
+ python3 -c 'import sys, json; print(json.load(sys.stdin)[0]["address"])')
+ srgw_n9_mac=$(ip -n "$srgw" -j link show veth-n9 | \
+ python3 -c 'import sys, json; print(json.load(sys.stdin)[0]["address"])')
+ lupf_mac=$(ip -n "$lupf" -j link show veth-n6-lupf | \
+ python3 -c 'import sys, json; print(json.load(sys.stdin)[0]["address"])')
+ ip -n "$srgw" -6 neigh replace 2001:db8:2::e dev veth-n9 \
+ lladdr "$srupf_mac" nud permanent
+ ip -n "$srupf" -6 neigh replace 2001:db8:2::1 dev veth-n9-srupf \
+ lladdr "$srgw_n9_mac" nud permanent
+ # Non-T-PDU passthrough: srgw forwards GTP-U control out the
+ # End.M.GTP6.D route's dev (veth-n6); pre-resolve the lupf neighbour
+ # for the Echo Request DA.
+ ip -n "$srgw" -6 neigh replace 2001:db8:f::1 dev veth-n6 \
+ lladdr "$lupf_mac" nud permanent
+
+ # Per-route VRF case: a second SR-side upf in its own VRF. The
+ # End.M.GTP6.D SID for this tenant binds the SRv6 underlay output to
+ # the VRF via 'oif'; without it the lookup would fall through to
+ # the main table. Reported as [SKIP] when CONFIG_NET_VRF is not loaded.
+ modprobe vrf 2>/dev/null
+ if ip -n "$srgw" link add vrf-n9 type vrf table 100 2>/dev/null; then
+ have_vrf=1
+ ip -n "$srgw" link set dev vrf-n9 up
+
+ ip link add veth-n9-2 netns "$srgw" type veth peer name \
+ veth-n9-2-srupf netns "$srupf_vrf"
+ ip -n "$srgw" link set dev veth-n9-2 master vrf-n9
+ ip -n "$srgw" addr add 2001:db8:4::1/64 dev veth-n9-2 nodad
+ ip -n "$srupf_vrf" addr add 2001:db8:4::e/64 dev veth-n9-2-srupf \
+ nodad
+ ip -n "$srgw" link set dev veth-n9-2 up
+ ip -n "$srupf_vrf" link set dev veth-n9-2-srupf up
+
+ ip netns exec "$srgw" sysctl -wq \
+ net.ipv6.conf.veth-n9-2.seg6_enabled=1
+ ip netns exec "$srupf_vrf" sysctl -wq \
+ net.ipv6.conf.all.seg6_enabled=1
+ ip netns exec "$srupf_vrf" sysctl -wq \
+ net.ipv6.conf.veth-n9-2-srupf.seg6_enabled=1
+
+ ip -n "$gnb" -6 route add 2001:db8:f0::/64 via 2001:db8:1::1
+
+ ip -n "$srgw" -6 route add 2001:db8:f0::/64 \
+ encap seg6local action End.M.GTP6.D \
+ srh segs 2001:db8:4::e,2001:db8:5::e \
+ src 2001:db8:4::1 sr_prefix_len 64 count \
+ oif vrf-n9 \
+ dev veth-n9-2
+
+ ip -n "$srupf_vrf" -6 route add 2001:db8:4::e/128 dev lo
+ ip -n "$srupf_vrf" -6 route add 2001:db8:5::/64 dev veth-n9-2-srupf
+
+ local upf_vrf_mac
+ local srgw_e2_mac
+ upf_vrf_mac=$(ip -n "$srupf_vrf" -j link show \
+ veth-n9-2-srupf | python3 -c \
+ 'import sys, json; print(json.load(sys.stdin)[0]["address"])')
+ srgw_e2_mac=$(ip -n "$srgw" -j link show veth-n9-2 | \
+ python3 -c \
+ 'import sys, json; print(json.load(sys.stdin)[0]["address"])')
+ ip -n "$srgw" -6 neigh replace 2001:db8:4::e dev veth-n9-2 \
+ lladdr "$upf_vrf_mac" nud permanent
+ ip -n "$srupf_vrf" -6 neigh replace 2001:db8:4::1 \
+ dev veth-n9-2-srupf lladdr "$srgw_e2_mac" nud permanent
+ fi
+}
+
+check_dependencies()
+{
+ if ! command -v tcpdump >/dev/null; then
+ echo "SKIP: tcpdump is required"
+ exit "$ksft_skip"
+ fi
+
+ if ! command -v python3 >/dev/null; then
+ echo "SKIP: python3 is required"
+ exit "$ksft_skip"
+ fi
+
+ if ! ip route help 2>&1 | grep -qF "End.M.GTP6.D"; then
+ echo "SKIP: iproute2 too old, missing seg6local action End.M.GTP6.D"
+ exit "$ksft_skip"
+ fi
+
+ if ! python3 -c "import scapy.all" 2>/dev/null; then
+ echo "SKIP: python3-scapy is required"
+ exit "$ksft_skip"
+ fi
+}
+
+send_gtpu()
+{
+ local outer_dst="$1" # IPv6 destination of the GTP-U packet
+ local srgw_mac
+
+ srgw_mac=$(ip -n "$srgw" -j link show veth-n3-srgw | \
+ python3 -c 'import sys, json; print(json.load(sys.stdin)[0]["address"])')
+
+ SRGW_MAC="$srgw_mac" OUTER_DST="$outer_dst" \
+ ip netns exec "$gnb" python3 - <<'PY'
+import os
+from scapy.all import IPv6, UDP, IP, ICMP, sendp, Ether
+
+mac = os.environ['SRGW_MAC']
+outer_dst = os.environ['OUTER_DST']
+# GTPv1 long header (E bit set, next ext = 0x85 PDU Session) carrying
+# TEID 0x00000123, followed by a PDU Session ext (PDU Type=DL, QFI=5).
+gtpu = bytes.fromhex(
+ "34 ff 00 24 00 00 01 23 00 00 00 85" # long header
+ "01 00 05 00" # PDU Session ext
+)
+inner = bytes(IP(src="10.0.0.1", dst="10.0.0.2") / ICMP())
+pkt = (Ether(dst=mac) /
+ IPv6(src="2001:db8:1::2", dst=outer_dst) /
+ UDP(sport=2152, dport=2152) /
+ (gtpu + inner))
+sendp(pkt, iface="veth-n3", verbose=False)
+PY
+}
+
+# Send a GTPv1-U Echo Request; End.M.GTP6.D must NOT consume it but
+# pass it through to the configured forwarding path so the downstream
+# UPF (legacy GTP-U control plane) can answer. Verified by capturing
+# the unaltered Echo Request (type 0x01) on the lupf side.
+send_gtpu_echo()
+{
+ local outer_dst="$1"
+ local srgw_mac
+
+ srgw_mac=$(ip -n "$srgw" -j link show veth-n3-srgw | \
+ python3 -c 'import sys, json; print(json.load(sys.stdin)[0]["address"])')
+
+ SRGW_MAC="$srgw_mac" OUTER_DST="$outer_dst" \
+ ip netns exec "$gnb" python3 - <<'PY'
+import os
+from scapy.all import IPv6, UDP, sendp, Ether
+mac = os.environ['SRGW_MAC']
+outer_dst = os.environ['OUTER_DST']
+gtpu_echo = bytes.fromhex("32 01 00 04 00 00 00 00 42 42 00 00")
+pkt = (Ether(dst=mac) /
+ IPv6(src="2001:db8:1::2", dst=outer_dst) /
+ UDP(sport=2152, dport=2152) /
+ gtpu_echo)
+sendp(pkt, iface="veth-n3", verbose=False)
+PY
+}
+
+run_echo_test()
+{
+ local outer_dst="$1"
+ local out
+ local rc
+
+ out=$(mktemp)
+
+ ip netns exec "$lupf" tcpdump -U -nni veth-n6-lupf -w "$out" \
+ 'udp port 2152' 2>/dev/null &
+ tcpdump_pid=$!
+ sleep 1
+
+ send_gtpu_echo "$outer_dst"
+
+ sleep 1
+ kill -INT "$tcpdump_pid" 2>/dev/null
+ wait "$tcpdump_pid" 2>/dev/null
+ tcpdump_pid=""
+
+ OUTER_DST="$outer_dst" python3 - "$out" <<'PYEOF'
+import os, sys
+from scapy.all import rdpcap, IPv6, UDP
+
+want_dst = os.environ['OUTER_DST']
+pkts = rdpcap(sys.argv[1])
+for p in pkts:
+ if IPv6 not in p or UDP not in p:
+ continue
+ if p[UDP].sport != 2152 or p[UDP].dport != 2152:
+ continue
+ if p[IPv6].dst != want_dst:
+ continue
+ payload = bytes(p[UDP].payload)
+ if len(payload) >= 2 and payload[1] == 0x01:
+ sys.exit(0)
+sys.exit("no GTPv1-U Echo Request observed at lupf "
+ "(End.M.GTP6.D failed to pass non-T-PDU through)")
+PYEOF
+ rc=$?
+ rm -f "$out"
+ return $rc
+}
+
+capture_traffic()
+{
+ local capture_ns="$1"
+ local capture_iface="$2"
+ local outer_dst="$3"
+ local out="$4"
+
+ ip netns exec "$capture_ns" tcpdump -U -nni "$capture_iface" -w "$out" \
+ 'ip6' 2>/dev/null &
+ tcpdump_pid=$!
+ # Give tcpdump a brief moment to attach the BPF filter.
+ sleep 1
+
+ send_gtpu "$outer_dst"
+
+ sleep 1
+ kill -INT "$tcpdump_pid" 2>/dev/null
+ wait "$tcpdump_pid" 2>/dev/null
+ tcpdump_pid=""
+}
+
+run_test()
+{
+ local outer_dst="$1" # GTP-U outer IPv6 DA
+ local expected_srh0="$2" # expected SRH[0] in upf
+ local capture_ns="${3:-$srupf}" # netns where SRv6 should land
+ local capture_iface="${4:-veth-n9-srupf}"
+ local out
+
+ out=$(mktemp)
+ capture_traffic "$capture_ns" "$capture_iface" "$outer_dst" "$out"
+
+ # Verify with scapy: an SRv6 packet (IPv6 + Routing Header type 4)
+ # must reach the upf. Per RFC 9433 Section 6.5 Note, SRH[1]
+ # carries Args.Mob.Session and SRH[0] carries the original outer DA.
+ EXPECTED_SRH0="$expected_srh0" python3 - "$out" <<'PYEOF'
+import ipaddress, os, sys
+from scapy.all import rdpcap, IPv6, IPv6ExtHdrSegmentRouting
+
+expected_srh0 = os.environ['EXPECTED_SRH0']
+pkts = rdpcap(sys.argv[1])
+for p in pkts:
+ if not (IPv6 in p and IPv6ExtHdrSegmentRouting in p):
+ continue
+ srh = p[IPv6ExtHdrSegmentRouting]
+ if srh.type != 4:
+ sys.exit(f"unexpected RH type {srh.type}")
+ if len(srh.addresses) < 2:
+ continue
+ srh0 = ipaddress.IPv6Address(str(srh.addresses[0])).packed
+ if srh0 != ipaddress.IPv6Address(expected_srh0).packed:
+ sys.exit(f"SRH[0] = {ipaddress.IPv6Address(srh0)} "
+ f"(want {expected_srh0}, the preserved outer DA)")
+ srh1 = ipaddress.IPv6Address(str(srh.addresses[1])).packed
+ args = srh1[8:13]
+ if args != bytes.fromhex("1400000123"):
+ sys.exit(f"Args.Mob.Session = {args.hex()} (want 1400000123)")
+ sys.exit(0)
+sys.exit("no SRv6 (RT6 type=4) packet with 2+ segments observed at upf")
+PYEOF
+ local rc=$?
+ rm -f "$out"
+ return $rc
+}
+
+# Verify that nf_hooks_lwtunnel=1 makes the inner T-PDU 5-tuple
+# visible to nftables on the SR Gateway. The nft rule matches on the
+# inner IPv4 source address (10.0.0.1, set by send_gtpu()); a DROP
+# verdict must prevent any SRv6 packet from reaching the upf, an
+# ACCEPT verdict must let it through unchanged.
+run_nf_test()
+{
+ local verdict="$1" # drop | accept
+ local expect_srh0="$2" # preserved-DA test, empty when no packet expected
+ local outer_dst="2001:db8:f::1"
+ local out
+
+ # fresh prerouting chain so each invocation starts clean
+ ip netns exec "$srgw" nft flush chain ip filter prerouting
+ ip netns exec "$srgw" nft add rule ip filter prerouting \
+ ip saddr 10.0.0.1 "$verdict"
+
+ out=$(mktemp)
+ capture_traffic "$srupf" "veth-n9-srupf" "$outer_dst" "$out"
+
+ if [ -n "$expect_srh0" ]; then
+ EXPECTED_SRH0="$expect_srh0" python3 - "$out" <<'PYEOF'
+import ipaddress, os, sys
+from scapy.all import rdpcap, IPv6, IPv6ExtHdrSegmentRouting
+
+expected_srh0 = os.environ['EXPECTED_SRH0']
+pkts = rdpcap(sys.argv[1])
+for p in pkts:
+ if not (IPv6 in p and IPv6ExtHdrSegmentRouting in p):
+ continue
+ srh = p[IPv6ExtHdrSegmentRouting]
+ if len(srh.addresses) < 2:
+ continue
+ srh0 = ipaddress.IPv6Address(str(srh.addresses[0])).packed
+ if srh0 == ipaddress.IPv6Address(expected_srh0).packed:
+ sys.exit(0)
+sys.exit("expected SRv6 packet not observed at upf despite nft accept")
+PYEOF
+ else
+ python3 - "$out" <<'PYEOF'
+import sys
+from scapy.all import rdpcap, IPv6, IPv6ExtHdrSegmentRouting
+
+pkts = rdpcap(sys.argv[1])
+for p in pkts:
+ if IPv6 in p and IPv6ExtHdrSegmentRouting in p:
+ sys.exit("SRv6 packet leaked to upf despite nft drop on inner")
+sys.exit(0)
+PYEOF
+ fi
+ local rc=$?
+ rm -f "$out"
+ return $rc
+}
+
+main()
+{
+ local rc=0
+
+ check_dependencies
+ setup
+
+ if run_test "2001:db8:f::1" "2001:db8:f::1"; then
+ echo "TEST: End.M.GTP6.D (default) [PASS]"
+ else
+ echo "TEST: End.M.GTP6.D (default) [FAIL]"
+ rc=1
+ fi
+
+ if run_echo_test "2001:db8:f::1"; then
+ echo "TEST: End.M.GTP6.D (non-T-PDU passthrough) [PASS]"
+ else
+ echo "TEST: End.M.GTP6.D (non-T-PDU passthrough) [FAIL]"
+ rc=1
+ fi
+
+ # VRF binding: SRv6 underlay output goes through vrf-n9 (table 100).
+ # Reported as [SKIP] when CONFIG_NET_VRF is not loaded.
+ if [ "$have_vrf" = "1" ]; then
+ if run_test "2001:db8:f0::1" "2001:db8:f0::1" \
+ "$srupf_vrf" "veth-n9-2-srupf"; then
+ echo "TEST: End.M.GTP6.D (oif vrf-n9) [PASS]"
+ else
+ echo "TEST: End.M.GTP6.D (oif vrf-n9) [FAIL]"
+ rc=1
+ fi
+ else
+ echo "TEST: End.M.GTP6.D (oif vrf-n9) [SKIP] (CONFIG_NET_VRF not loaded)"
+ fi
+
+ # Inner T-PDU netfilter hook: only meaningful when nft is present
+ # and the kernel exposes net.netfilter.nf_hooks_lwtunnel. The
+ # sysctl is one-way (cannot be cleared), but each test runs in a
+ # fresh netns so this is harmless.
+ if command -v nft >/dev/null && \
+ ip netns exec "$srgw" sysctl -wq \
+ net.netfilter.nf_hooks_lwtunnel=1 2>/dev/null; then
+ ip netns exec "$srgw" nft add table ip filter
+ ip netns exec "$srgw" nft \
+ 'add chain ip filter prerouting { type filter hook prerouting priority 0; }'
+
+ if run_nf_test drop ""; then
+ echo "TEST: End.M.GTP6.D (nft drop on inner) [PASS]"
+ else
+ echo "TEST: End.M.GTP6.D (nft drop on inner) [FAIL]"
+ rc=1
+ fi
+
+ if run_nf_test accept "2001:db8:f::1"; then
+ echo "TEST: End.M.GTP6.D (nft accept on inner) [PASS]"
+ else
+ echo "TEST: End.M.GTP6.D (nft accept on inner) [FAIL]"
+ rc=1
+ fi
+ else
+ echo "TEST: End.M.GTP6.D (inner-flow netfilter hook) [SKIP]" \
+ "(nft or nf_hooks_lwtunnel unavailable)"
+ fi
+
+ if [ "$rc" -eq 0 ]; then
+ echo "TEST: End.M.GTP6.D [PASS]"
+ exit "$ksft_pass"
+ else
+ echo "TEST: End.M.GTP6.D [FAIL]"
+ exit "$ksft_fail"
+ fi
+}
+
+main "$@"
--
2.50.1
* [PATCH v2 5/7] seg6: add End.M.GTP6.D.Di behavior
From: Yuya Kusakabe @ 2026-05-04 16:30 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Andrea Mayer, Shuah Khan, Jonathan Corbet,
Shuah Khan
Cc: linux-kernel, netdev, linux-kselftest, linux-doc, Yuya Kusakabe
Add the End.M.GTP6.D.Di drop-in mode variant of End.M.GTP6.D
(RFC 9433 Section 6.4). Unlike End.M.GTP6.D, the drop-in variant
does NOT fold the GTP-U identifiers into Args.Mob.Session. As in
End.M.GTP6.D, the original outer IPv6 destination is preserved at
SRH[0] of the new SRH, so the destination side can keep the
original address untouched while still benefiting from SR Policy
steering.
The augmented SRH builder/destroyer is shared with End.M.GTP6.D.
The TEID and QFI parsed out of the inbound GTP-U header are
intentionally discarded for this variant (matching RFC 9433
Section 6.4).
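
For contrast, the folding that End.M.GTP6.D performs and this variant
skips can be sketched in user-space Python. This is only an
illustration, not the kernel code: the helper names args_mob_session()
and stamp_args() are hypothetical, the field layout (QFI in the top
6 bits, then R and U, then a 32-bit PDU Session ID taken from the TEID)
follows the End.M.GTP6.D selftest comment, and overwriting the 40 bits
directly after the locator is an assumption about how the kernel helper
places them.

```python
import ipaddress

ARGS_MOB_BITS = 40  # QFI(6) | R(1) | U(1) | PDU Session ID(32)


def args_mob_session(teid, qfi, r=0, u=0):
    """Pack TEID/QFI into a 40-bit Args.Mob.Session integer (hypothetical helper)."""
    first = ((qfi & 0x3F) << 2) | ((r & 1) << 1) | (u & 1)
    return (first << 32) | (teid & 0xFFFFFFFF)


def stamp_args(sid, prefix_len, args):
    """Overwrite the 40 bits that follow a prefix_len-bit locator (assumed placement)."""
    if not 1 <= prefix_len <= 128 - ARGS_MOB_BITS:
        raise ValueError("prefix_len must be in 1..88")
    shift = 128 - prefix_len - ARGS_MOB_BITS
    mask = ((1 << ARGS_MOB_BITS) - 1) << shift
    value = int(ipaddress.IPv6Address(sid))
    value = (value & ~mask) | (args << shift)
    return ipaddress.IPv6Address(value)


# Values used by the End.M.GTP6.D selftest: TEID 0x123, QFI 5, /64 locator.
args = args_mob_session(teid=0x123, qfi=5)
sid = stamp_args("2001:db8:3::e", 64, args)
print(args.to_bytes(5, "big").hex())
print(sid.packed[8:13].hex())
```

Under End.M.GTP6.D.Di none of this runs: segments[1+] stay exactly as
the user configured them.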
When net.netfilter.nf_hooks_lwtunnel=1, the inner T-PDU traverses
NF_INET_PRE_ROUTING between the GTP-U strip and the SRv6 push,
mirroring End.DX4 / End.DX6.
Non-T-PDU GTP-U messages are forwarded the same way as in
End.M.GTP6.D: passed through via the lwtunnel's saved orig_input
to a downstream peer that owns the GTP-U control plane.
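
The T-PDU / non-T-PDU split described above can be sketched in user
space as follows. This is a rough model of the shared GTP-U parser, not
a copy of it: parse_gtpu() and its tuple return values are hypothetical,
"passthrough" stands in for -EOPNOTSUPP and "drop" for -EINVAL, and the
constants (base flags 0x30, QFI mask 0x3f, PDU Session next-header 0x85)
are assumptions about the kernel macros.

```python
import struct

GTP_TPDU = 0xFF
F_EXT, F_SEQ, F_NPDU = 0x04, 0x02, 0x01
PDU_SESSION_NH = 0x85  # assumed value of SEG6_MOBILE_PDU_SESSION_NH


def parse_gtpu(buf):
    """Classify a GTP-U payload: ("tpdu", hdrlen, teid, qfi), ("passthrough",) or ("drop",)."""
    if len(buf) < 8:
        return ("drop",)
    flags, mtype, _length, teid = struct.unpack_from("!BBHI", buf)
    # Anything that is not a GTPv1-U T-PDU is left to the downstream peer.
    if (flags & 0xF8) != 0x30 or mtype != GTP_TPDU:
        return ("passthrough",)
    qfi = 0
    if not flags & (F_EXT | F_SEQ | F_NPDU):
        return ("tpdu", 8, teid, qfi)
    if len(buf) < 12:
        return ("drop",)
    hdrlen = 12
    if not flags & F_EXT:
        return ("tpdu", hdrlen, teid, qfi)
    nxt = buf[11]
    while nxt != 0:
        # Each extension header starts with its length in 4-byte units.
        if len(buf) < hdrlen + 1 or buf[hdrlen] == 0:
            return ("drop",)
        ext_bytes = buf[hdrlen] * 4
        if len(buf) < hdrlen + ext_bytes:
            return ("drop",)
        if nxt == PDU_SESSION_NH:
            if ext_bytes != 4:
                return ("drop",)
            qfi = buf[hdrlen + 2] & 0x3F
        nxt = buf[hdrlen + ext_bytes - 1]
        hdrlen += ext_bytes
    return ("tpdu", hdrlen, teid, qfi)


# The two headers injected by the End.M.GTP6.D selftest.
tpdu = bytes.fromhex("34ff00240000012300000085" "01000500")
echo = bytes.fromhex("320100040000000042420000")
print(parse_gtpu(tpdu))
print(parse_gtpu(echo))
```

For End.M.GTP6.D.Di the TEID and QFI in the "tpdu" result are simply
ignored; only the header length is needed to strip the GTP-U envelope.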

Configuration:

    ip -6 route add 2001:db8:f::/64 \
        encap seg6local action End.M.GTP6.D.Di \
        srh segs 2001:db8:2::e,2001:db8:3::e \
        src 2001:db8:2::1 \
        dev <dev>

Link: https://www.rfc-editor.org/rfc/rfc9433.html#section-6.4
Signed-off-by: Yuya Kusakabe <yuya.kusakabe@gmail.com>
---
include/uapi/linux/seg6_local.h | 2 +
net/ipv6/seg6_local.c | 222 +++++++++++
tools/testing/selftests/net/Makefile | 1 +
.../selftests/net/srv6_end_m_gtp6_d_di_test.sh | 427 +++++++++++++++++++++
4 files changed, 652 insertions(+)
diff --git a/include/uapi/linux/seg6_local.h b/include/uapi/linux/seg6_local.h
index 7d3d3d245b47..326da65ad5aa 100644
--- a/include/uapi/linux/seg6_local.h
+++ b/include/uapi/linux/seg6_local.h
@@ -80,6 +80,8 @@ enum {
SEG6_LOCAL_ACTION_END_M_GTP6_E = 19,
/* IPv6/GTP-U decap into SRv6 (RFC 9433 Section 6.3) */
SEG6_LOCAL_ACTION_END_M_GTP6_D = 20,
+ /* IPv6/GTP-U decap into SRv6, drop-in mode (RFC 9433 Section 6.4) */
+ SEG6_LOCAL_ACTION_END_M_GTP6_D_DI = 21,
__SEG6_LOCAL_ACTION_MAX,
};
diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c
index 09e912e17df8..a6cd57ebcbde 100644
--- a/net/ipv6/seg6_local.c
+++ b/net/ipv6/seg6_local.c
@@ -2578,6 +2578,216 @@ static void seg6_end_m_gtp6_d_aug_destroy(struct seg6_local_lwt *slwt)
slwt->mobile_info.aug_srh = NULL;
}
+/* Per-skb context preserved across the NF_INET_PRE_ROUTING hook on
+ * the inner T-PDU exposed by End.M.GTP6.D.Di. Only the original
+ * outer DA is needed in the finish half (it is stamped into SRH[0]
+ * after seg6_do_srh_encap()).
+ */
+struct seg6_mobile_gtp6_d_di_cb {
+ struct in6_addr orig_dst;
+};
+
+#define SEG6_MOBILE_GTP6_D_DI_CB(skb) \
+ ((struct seg6_mobile_gtp6_d_di_cb *)((skb)->cb))
+
+static int input_action_end_m_gtp6_d_di_finish(struct net *net,
+ struct sock *sk,
+ struct sk_buff *skb)
+{
+ struct seg6_mobile_gtp6_d_di_cb cb = *SEG6_MOBILE_GTP6_D_DI_CB(skb);
+ struct dst_entry *orig_dst = skb_dst(skb);
+ const struct seg6_mobile_info *minfo;
+ enum skb_drop_reason reason;
+ struct seg6_local_lwt *slwt;
+ struct ipv6_sr_hdr *new_srh;
+ int inner_proto;
+ int err;
+
+ slwt = seg6_local_lwtunnel(orig_dst->lwtstate);
+ minfo = &slwt->mobile_info;
+
+ inner_proto = (skb->protocol == htons(ETH_P_IP)) ? IPPROTO_IPIP
+ : IPPROTO_IPV6;
+
+ err = seg6_do_srh_encap(skb, minfo->aug_srh, inner_proto);
+ if (err) {
+ reason = (err == -ENOMEM) ? SKB_DROP_REASON_SEG6_MOBILE_NOMEM
+ : SKB_DROP_REASON_SEG6_MOBILE_BAD_INNER;
+ goto drop;
+ }
+
+ skb->protocol = htons(ETH_P_IPV6);
+
+ /* Stamp the prepended segments[0] (originally zeroed in
+ * minfo->aug_srh) with the saved original outer DA, in the
+ * in-skb SRH that seg6_do_srh_encap() just pushed.
+ */
+ new_srh = (struct ipv6_sr_hdr *)(skb_network_header(skb) +
+ sizeof(struct ipv6hdr));
+ new_srh->segments[0] = cb.orig_dst;
+
+ ipv6_hdr(skb)->saddr = minfo->src_addr;
+
+ skb_set_transport_header(skb, sizeof(struct ipv6hdr));
+ nf_reset_ct(skb);
+ skb_dst_drop(skb);
+
+ seg6_lookup_any_nexthop(skb, NULL, 0, false, slwt->oif);
+ return dst_input(skb);
+
+drop:
+ kfree_skb_reason(skb, reason);
+ return -EINVAL;
+}
+
+/* RFC 9433 Section 6.4 -- End.M.GTP6.D.Di
+ * Drop-in interconnect variant of End.M.GTP6.D: instead of folding the
+ * GTP-U identifiers into Args.Mob.Session, the original outer IPv6 DA
+ * is preserved at SRH[0] so the destination side can keep the address
+ * untouched.
+ *
+ * When net.netfilter.nf_hooks_lwtunnel=1 the inner T-PDU is exposed
+ * to NF_INET_PRE_ROUTING after the GTP-U strip and before the SRv6
+ * push, identical to End.M.GTP6.D.
+ */
+static int input_action_end_m_gtp6_d_di(struct sk_buff *skb,
+ struct seg6_local_lwt *slwt)
+{
+ enum skb_drop_reason reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_GTPU;
+ unsigned int outer_len, inner_off;
+ int gtp_hdrlen, inner_proto, inner_nfproto;
+ struct in6_addr orig_dst;
+ struct ipv6_sr_hdr *srh;
+ struct ipv6hdr *ip6h;
+ struct udphdr *uh;
+ u32 teid;
+ u8 inner_first, qfi;
+
+ BUILD_BUG_ON(sizeof(struct seg6_mobile_gtp6_d_di_cb) >
+ sizeof_field(struct sk_buff, cb));
+
+ srh = seg6_get_srh(skb, 0);
+ if (srh && srh->segments_left != 0) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_INVALID_SRH_SL;
+ goto drop;
+ }
+
+ if (!pskb_may_pull(skb, sizeof(struct ipv6hdr))) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_INNER;
+ goto drop;
+ }
+
+ ip6h = ipv6_hdr(skb);
+ orig_dst = ip6h->daddr;
+
+ /* Same dispatch as End.M.GTP6.D (RFC 9433 Section 6.4 reuses
+ * the S01-S11 logic from Section 6.3): GTP-U traffic is
+ * decapsulated and re-encapsulated, anything else falls
+ * through to End.
+ */
+ {
+ __be16 frag_off;
+ u8 nh = ip6h->nexthdr;
+ int upper_off;
+
+ upper_off = ipv6_skip_exthdr(skb, sizeof(*ip6h), &nh,
+ &frag_off);
+ if (upper_off < 0) {
+ /* Outer IPv6 ext-header walk failed; the GTP-U
+ * envelope below it is unreachable.
+ */
+ reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_GTPU;
+ goto drop;
+ }
+
+ if (nh != IPPROTO_UDP)
+ return input_action_end(skb, slwt);
+
+ if (!pskb_may_pull(skb, upper_off + sizeof(*uh))) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_GTPU;
+ goto drop;
+ }
+
+ ip6h = ipv6_hdr(skb);
+ uh = (struct udphdr *)((u8 *)ip6h + upper_off);
+ if (uh->dest != htons(GTP1U_PORT))
+ return input_action_end(skb, slwt);
+
+ /* TEID/QFI are not consumed by the drop-in variant
+ * (RFC 9433 Section 6.4); seg6_mobile_parse_gtpu() is
+ * still required to compute the GTP-U header length so
+ * the outer chain (IPv6+UDP+GTP) can be popped correctly.
+ */
+ gtp_hdrlen = seg6_mobile_parse_gtpu(skb,
+ upper_off + sizeof(*uh),
+ &teid, &qfi);
+ if (gtp_hdrlen == -EOPNOTSUPP)
+ return seg6_mobile_passthrough_non_tpdu(skb);
+ if (gtp_hdrlen < 0) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_GTPU;
+ goto drop;
+ }
+ (void)teid;
+ (void)qfi;
+
+ outer_len = upper_off + sizeof(*uh) + gtp_hdrlen;
+ }
+
+ if (!pskb_may_pull(skb, outer_len + 1)) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_INNER;
+ goto drop;
+ }
+
+ inner_off = outer_len;
+ inner_first = *((u8 *)skb->data + inner_off);
+ switch (inner_first >> 4) {
+ case 4:
+ inner_proto = IPPROTO_IPIP;
+ inner_nfproto = NFPROTO_IPV4;
+ break;
+ case 6:
+ inner_proto = IPPROTO_IPV6;
+ inner_nfproto = NFPROTO_IPV6;
+ break;
+ default:
+ reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_INNER;
+ goto drop;
+ }
+
+ if (!pskb_may_pull(skb, outer_len +
+ ((inner_proto == IPPROTO_IPIP) ?
+ sizeof(struct iphdr) : sizeof(struct ipv6hdr)))) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_INNER;
+ goto drop;
+ }
+
+ skb_pull_rcsum(skb, outer_len);
+ skb_reset_network_header(skb);
+
+ skb->protocol = (inner_proto == IPPROTO_IPIP) ? htons(ETH_P_IP)
+ : htons(ETH_P_IPV6);
+
+ skb_set_transport_header(skb,
+ (inner_proto == IPPROTO_IPIP) ?
+ sizeof(struct iphdr) :
+ sizeof(struct ipv6hdr));
+ nf_reset_ct(skb);
+
+ SEG6_MOBILE_GTP6_D_DI_CB(skb)->orig_dst = orig_dst;
+
+ if (static_branch_unlikely(&nf_hooks_lwtunnel_enabled))
+ return NF_HOOK(inner_nfproto, NF_INET_PRE_ROUTING,
+ dev_net(skb->dev), NULL, skb, skb->dev,
+ NULL, input_action_end_m_gtp6_d_di_finish);
+
+ return input_action_end_m_gtp6_d_di_finish(dev_net(skb->dev), NULL,
+ skb);
+
+drop:
+ kfree_skb_reason(skb, reason);
+ return -EINVAL;
+}
+
/* RFC 9433 Section 6.5 -- End.M.GTP6.E
* Receives an SRv6 packet whose current SID is an End.M.GTP6.E SID
* (Segments Left == 1) and re-encapsulates the inner payload in
@@ -2913,6 +3123,18 @@ static struct seg6_action_desc seg6_action_table[] = {
.destroy_state = seg6_end_m_gtp6_d_aug_destroy,
},
},
+ {
+ .action = SEG6_LOCAL_ACTION_END_M_GTP6_D_DI,
+ .attrs = SEG6_F_ATTR(SEG6_LOCAL_SRH) |
+ SEG6_F_ATTR(SEG6_LOCAL_MOBILE_SRC_ADDR),
+ .optattrs = SEG6_F_LOCAL_COUNTERS |
+ SEG6_F_ATTR(SEG6_LOCAL_OIF),
+ .input = input_action_end_m_gtp6_d_di,
+ .slwt_ops = {
+ .build_state = seg6_end_m_gtp6_d_aug_build,
+ .destroy_state = seg6_end_m_gtp6_d_aug_destroy,
+ },
+ },
{
.action = SEG6_LOCAL_ACTION_END_MAP,
.attrs = SEG6_F_ATTR(SEG6_LOCAL_NH6),
diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index 242195d7a8d8..a770e711652e 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -91,6 +91,7 @@ TEST_PROGS := \
srv6_end_dx6_netfilter_test.sh \
srv6_end_flavors_test.sh \
srv6_end_m_gtp4_e_test.sh \
+ srv6_end_m_gtp6_d_di_test.sh \
srv6_end_m_gtp6_d_test.sh \
srv6_end_m_gtp6_e_test.sh \
srv6_end_map_test.sh \
diff --git a/tools/testing/selftests/net/srv6_end_m_gtp6_d_di_test.sh b/tools/testing/selftests/net/srv6_end_m_gtp6_d_di_test.sh
new file mode 100755
index 000000000000..81465b59c54a
--- /dev/null
+++ b/tools/testing/selftests/net/srv6_end_m_gtp6_d_di_test.sh
@@ -0,0 +1,427 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# shellcheck disable=SC2034,SC2154
+#
+# Selftest for the SRv6 End.M.GTP6.D.Di drop-in behavior
+# (RFC 9433 Section 6.4).
+#
+# Topology mirrors srv6_end_m_gtp6_d_test.sh. The key difference is
+# that the End.M.GTP6.D.Di action preserves the original outer IPv6
+# destination address (here 2001:db8:f::dead) as the final SRH segment,
+# rather than folding GTP-U identifiers into Args.Mob.Session.
+
+source lib.sh
+
+readonly TIMEOUT=4
+
+tcpdump_pid=""
+have_vrf=0
+
+cleanup()
+{
+ if [ -n "$tcpdump_pid" ]; then
+ kill "$tcpdump_pid" 2>/dev/null
+ wait "$tcpdump_pid" 2>/dev/null
+ fi
+ cleanup_all_ns
+}
+
+trap cleanup EXIT
+
+setup()
+{
+ setup_ns gnb srgw srupf lupf srupf_vrf
+
+ ip -n "$gnb" link set lo up
+ ip -n "$srgw" link set lo up
+ ip -n "$srupf" link set lo up
+ ip -n "$lupf" link set lo up
+ ip -n "$srupf_vrf" link set lo up
+
+ ip link add veth-n3 netns "$gnb" type veth peer name veth-n3-srgw \
+ netns "$srgw"
+ ip -n "$gnb" addr add 2001:db8:1::2/64 dev veth-n3 nodad
+ ip -n "$srgw" addr add 2001:db8:1::1/64 dev veth-n3-srgw nodad
+ ip -n "$gnb" link set veth-n3 up
+ ip -n "$srgw" link set veth-n3-srgw up
+
+ # srgw <-> srupf (SR-aware UPF, T-PDU SRv6 destination)
+ ip link add veth-n9 netns "$srgw" type veth peer name veth-n9-srupf \
+ netns "$srupf"
+ ip -n "$srgw" addr add 2001:db8:2::1/64 dev veth-n9 nodad
+ ip -n "$srupf" addr add 2001:db8:2::e/64 dev veth-n9-srupf nodad
+ ip -n "$srgw" link set veth-n9 up
+ ip -n "$srupf" link set veth-n9-srupf up
+
+ # srgw <-> lupf (legacy UPF, GTP-U control plane recipient)
+ ip link add veth-n6 netns "$srgw" type veth peer name veth-n6-lupf \
+ netns "$lupf"
+ ip -n "$srgw" addr add 2001:db8:6::1/64 dev veth-n6 nodad
+ ip -n "$lupf" addr add 2001:db8:6::e/64 dev veth-n6-lupf nodad
+ ip -n "$srgw" link set veth-n6 up
+ ip -n "$lupf" link set veth-n6-lupf up
+
+ ip netns exec "$srgw" sysctl -wq net.ipv6.conf.all.forwarding=1
+
+ local srupf_mac srgw_n9_mac lupf_mac
+ srupf_mac=$(ip -n "$srupf" -j link show veth-n9-srupf | \
+ python3 -c 'import sys, json; print(json.load(sys.stdin)[0]["address"])')
+ srgw_n9_mac=$(ip -n "$srgw" -j link show veth-n9 | \
+ python3 -c 'import sys, json; print(json.load(sys.stdin)[0]["address"])')
+ lupf_mac=$(ip -n "$lupf" -j link show veth-n6-lupf | \
+ python3 -c 'import sys, json; print(json.load(sys.stdin)[0]["address"])')
+ ip -n "$srgw" -6 neigh replace 2001:db8:2::e dev veth-n9 \
+ lladdr "$srupf_mac" nud permanent 2>/dev/null || true
+ ip -n "$srupf" -6 neigh replace 2001:db8:2::1 dev veth-n9-srupf \
+ lladdr "$srgw_n9_mac" nud permanent 2>/dev/null || true
+ # Non-T-PDU passthrough: pre-resolve the Echo Request DA so the
+ # srgw can hand the packet off to the legacy upf via veth-n6.
+ ip -n "$srgw" -6 neigh replace 2001:db8:f::dead dev veth-n6 \
+ lladdr "$lupf_mac" nud permanent 2>/dev/null || true
+
+ ip -n "$gnb" -6 route add 2001:db8:f::/64 via 2001:db8:1::1
+
+ # dev veth-n6 is the legacy UPF leg for non-T-PDU passthrough; T-PDU
+ # encap takes the IPv6 SR Policy path via a separate FIB lookup.
+ ip -n "$srgw" -6 route add 2001:db8:f::/64 \
+ encap seg6local action End.M.GTP6.D.Di \
+ srh segs 2001:db8:2::e,2001:db8:3::e \
+ src 2001:db8:2::1 \
+ dev veth-n6
+
+ ip -n "$srupf" -6 route add 2001:db8:3::/64 dev veth-n9-srupf
+
+ # Per-route VRF case: a second SR-side upf in its own VRF. The
+ # End.M.GTP6.D.Di SID for this tenant binds the SRv6 underlay output
+ # to the VRF via 'oif'. Reported as [SKIP] when VRF support is unavailable.
+ modprobe vrf 2>/dev/null
+ if ip -n "$srgw" link add vrf-n9 type vrf table 100 2>/dev/null; then
+ have_vrf=1
+ ip -n "$srgw" link set dev vrf-n9 up
+
+ ip link add veth-n9-2 netns "$srgw" type veth peer name \
+ veth-n9-2-srupf netns "$srupf_vrf"
+ ip -n "$srgw" link set dev veth-n9-2 master vrf-n9
+ ip -n "$srgw" addr add 2001:db8:4::1/64 dev veth-n9-2 nodad
+ ip -n "$srupf_vrf" addr add 2001:db8:4::e/64 dev veth-n9-2-srupf \
+ nodad
+ ip -n "$srgw" link set dev veth-n9-2 up
+ ip -n "$srupf_vrf" link set dev veth-n9-2-srupf up
+
+ local upf_vrf_mac srgw_e2_mac
+ upf_vrf_mac=$(ip -n "$srupf_vrf" -j link show \
+ veth-n9-2-srupf | python3 -c \
+ 'import sys, json; print(json.load(sys.stdin)[0]["address"])')
+ srgw_e2_mac=$(ip -n "$srgw" -j link show veth-n9-2 | \
+ python3 -c \
+ 'import sys, json; print(json.load(sys.stdin)[0]["address"])')
+ ip -n "$srgw" -6 neigh replace 2001:db8:4::e dev veth-n9-2 \
+ lladdr "$upf_vrf_mac" nud permanent 2>/dev/null || true
+ ip -n "$srupf_vrf" -6 neigh replace 2001:db8:4::1 \
+ dev veth-n9-2-srupf lladdr "$srgw_e2_mac" nud permanent \
+ 2>/dev/null || true
+
+ ip -n "$gnb" -6 route add 2001:db8:f0::/64 via 2001:db8:1::1
+
+ ip -n "$srgw" -6 route add 2001:db8:f0::/64 \
+ encap seg6local action End.M.GTP6.D.Di \
+ srh segs 2001:db8:4::e,2001:db8:5::e \
+ src 2001:db8:4::1 oif vrf-n9 \
+ dev veth-n9-2
+
+ ip -n "$srupf_vrf" -6 route add 2001:db8:5::/64 \
+ dev veth-n9-2-srupf
+ fi
+}
+
+check_dependencies()
+{
+ if ! command -v tcpdump >/dev/null; then
+ echo "SKIP: tcpdump is required"; exit "$ksft_skip"
+ fi
+ if ! command -v python3 >/dev/null; then
+ echo "SKIP: python3 is required"; exit "$ksft_skip"
+ fi
+ if ! python3 -c "import scapy.all" 2>/dev/null; then
+ echo "SKIP: python3-scapy is required"; exit "$ksft_skip"
+ fi
+
+ if ! ip route help 2>&1 | grep -qF "End.M.GTP6.D.Di"; then
+ echo "SKIP: iproute2 too old, missing seg6local action End.M.GTP6.D.Di"
+ exit "$ksft_skip"
+ fi
+}
+
+send_gtpu()
+{
+ local outer_dst="$1"
+ local srgw_mac
+
+ srgw_mac=$(ip -n "$srgw" -j link show veth-n3-srgw | \
+ python3 -c 'import sys, json; print(json.load(sys.stdin)[0]["address"])')
+
+ SRGW_MAC="$srgw_mac" OUTER_DST="$outer_dst" \
+ ip netns exec "$gnb" python3 - <<'PY'
+import os
+from scapy.all import IPv6, UDP, IP, ICMP, sendp, Ether
+mac = os.environ['SRGW_MAC']
+outer_dst = os.environ['OUTER_DST']
+gtpu = bytes.fromhex(
+ "34 ff 00 24 00 00 01 23 00 00 00 85"
+ "01 00 05 00")
+inner = bytes(IP(src='10.0.0.1', dst='10.0.0.2') / ICMP())
+pkt = (Ether(dst=mac) /
+ IPv6(src='2001:db8:1::2', dst=outer_dst) /
+ UDP(sport=2152, dport=2152) /
+ (gtpu + inner))
+sendp(pkt, iface='veth-n3', verbose=False)
+PY
+}
+
+send_gtpu_echo()
+{
+ local outer_dst="$1"
+ local srgw_mac
+
+ srgw_mac=$(ip -n "$srgw" -j link show veth-n3-srgw | \
+ python3 -c 'import sys, json; print(json.load(sys.stdin)[0]["address"])')
+
+ SRGW_MAC="$srgw_mac" OUTER_DST="$outer_dst" \
+ ip netns exec "$gnb" python3 - <<'PY'
+import os
+from scapy.all import IPv6, UDP, sendp, Ether
+mac = os.environ['SRGW_MAC']
+outer_dst = os.environ['OUTER_DST']
+gtpu_echo = bytes.fromhex("32 01 00 04 00 00 00 00 42 42 00 00")
+pkt = (Ether(dst=mac) /
+ IPv6(src='2001:db8:1::2', dst=outer_dst) /
+ UDP(sport=2152, dport=2152) /
+ gtpu_echo)
+sendp(pkt, iface='veth-n3', verbose=False)
+PY
+}
+
+run_echo_test()
+{
+ local outer_dst="$1"
+ local out
+ local rc
+
+ out=$(mktemp)
+
+ ip netns exec "$lupf" tcpdump -U -nni veth-n6-lupf -w "$out" \
+ 'udp port 2152' 2>/dev/null &
+ tcpdump_pid=$!
+ sleep 1
+
+ send_gtpu_echo "$outer_dst"
+
+ sleep 1
+ kill -INT "$tcpdump_pid" 2>/dev/null
+ wait "$tcpdump_pid" 2>/dev/null
+ tcpdump_pid=""
+
+ OUTER_DST="$outer_dst" python3 - "$out" <<'PYEOF'
+import os, sys
+from scapy.all import rdpcap, IPv6, UDP
+
+want_dst = os.environ['OUTER_DST']
+pkts = rdpcap(sys.argv[1])
+for p in pkts:
+ if IPv6 not in p or UDP not in p:
+ continue
+ if p[UDP].sport != 2152 or p[UDP].dport != 2152:
+ continue
+ if p[IPv6].dst != want_dst:
+ continue
+ payload = bytes(p[UDP].payload)
+ if len(payload) >= 2 and payload[1] == 0x01:
+ sys.exit(0)
+sys.exit("no GTPv1-U Echo Request observed at lupf "
+ "(End.M.GTP6.D.Di failed to pass non-T-PDU through)")
+PYEOF
+ rc=$?
+ rm -f "$out"
+ return $rc
+}
+
+capture_traffic()
+{
+ local capture_ns="$1"
+ local capture_iface="$2"
+ local outer_dst="$3"
+ local out="$4"
+
+ ip netns exec "$capture_ns" tcpdump -U -nni "$capture_iface" -w "$out" \
+ 'ip6' 2>/dev/null &
+ tcpdump_pid=$!
+ # Give tcpdump a brief moment to attach the BPF filter.
+ sleep 1
+
+ send_gtpu "$outer_dst"
+
+ sleep 1
+ kill -INT "$tcpdump_pid" 2>/dev/null
+ wait "$tcpdump_pid" 2>/dev/null
+ tcpdump_pid=""
+}
+
+run_test()
+{
+ local outer_dst="$1" # GTP-U outer IPv6 DA
+ local capture_ns="${2:-$srupf}" # netns where SRv6 should land
+ local capture_iface="${3:-veth-n9-srupf}"
+ local out
+
+ out=$(mktemp)
+ capture_traffic "$capture_ns" "$capture_iface" "$outer_dst" "$out"
+
+ # scapy field check: an SRv6 (RT6 type=4) packet must reach upf
+ # and one of the SRH segments must contain the original outer DA
+ # (preserved by the drop-in variant).
+ OUTER_DST="$outer_dst" python3 - "$out" <<'PYEOF'
+import os, sys
+from scapy.all import rdpcap, IPv6, IPv6ExtHdrSegmentRouting
+
+outer_dst = os.environ['OUTER_DST'].lower()
+pkts = rdpcap(sys.argv[1])
+for p in pkts:
+ if not (IPv6 in p and IPv6ExtHdrSegmentRouting in p):
+ continue
+ srh = p[IPv6ExtHdrSegmentRouting]
+ if srh.type != 4:
+ continue
+ addrs = [str(a).lower() for a in srh.addresses]
+ if outer_dst in addrs:
+ sys.exit(0)
+ sys.exit(f"original DA not in SRH segments: {addrs}")
+sys.exit("no SRv6 (RT6 type=4) packet observed")
+PYEOF
+ local rc=$?
+ rm -f "$out"
+ return $rc
+}
+
+# Verify that nf_hooks_lwtunnel=1 makes the inner T-PDU 5-tuple
+# visible to nftables on the SR Gateway. The nft rule matches on the
+# inner IPv4 source address (10.0.0.1, set by send_gtpu()); a DROP
+# verdict must prevent any SRv6 packet from reaching the upf, an
+# ACCEPT verdict must let it through unchanged.
+run_nf_test()
+{
+ local verdict="$1" # drop | accept
+ local expect_da="$2" # preserved-DA address, empty when no packet expected
+ local outer_dst="2001:db8:f::dead"
+ local out
+
+ ip netns exec "$srgw" nft flush chain ip filter prerouting
+ ip netns exec "$srgw" nft add rule ip filter prerouting \
+ ip saddr 10.0.0.1 "$verdict"
+
+ out=$(mktemp)
+ capture_traffic "$srupf" "veth-n9-srupf" "$outer_dst" "$out"
+
+ if [ -n "$expect_da" ]; then
+ OUTER_DST="$expect_da" python3 - "$out" <<'PYEOF'
+import os, sys
+from scapy.all import rdpcap, IPv6, IPv6ExtHdrSegmentRouting
+
+outer_dst = os.environ['OUTER_DST'].lower()
+pkts = rdpcap(sys.argv[1])
+for p in pkts:
+ if not (IPv6 in p and IPv6ExtHdrSegmentRouting in p):
+ continue
+ srh = p[IPv6ExtHdrSegmentRouting]
+ addrs = [str(a).lower() for a in srh.addresses]
+ if outer_dst in addrs:
+ sys.exit(0)
+sys.exit("expected SRv6 packet not observed at upf despite nft accept")
+PYEOF
+ else
+ python3 - "$out" <<'PYEOF'
+import sys
+from scapy.all import rdpcap, IPv6, IPv6ExtHdrSegmentRouting
+
+pkts = rdpcap(sys.argv[1])
+for p in pkts:
+ if IPv6 in p and IPv6ExtHdrSegmentRouting in p:
+ sys.exit("SRv6 packet leaked to upf despite nft drop on inner")
+sys.exit(0)
+PYEOF
+ fi
+ local rc=$?
+ rm -f "$out"
+ return $rc
+}
+
+main()
+{
+ local rc=0
+
+ check_dependencies
+ setup
+
+ if run_test "2001:db8:f::dead"; then
+ echo "TEST: End.M.GTP6.D.Di (default) [PASS]"
+ else
+ echo "TEST: End.M.GTP6.D.Di (default) [FAIL]"
+ rc=1
+ fi
+
+ if run_echo_test "2001:db8:f::dead"; then
+ echo "TEST: End.M.GTP6.D.Di (non-T-PDU passthrough) [PASS]"
+ else
+ echo "TEST: End.M.GTP6.D.Di (non-T-PDU passthrough) [FAIL]"
+ rc=1
+ fi
+
+ # VRF binding: SRv6 underlay output goes through vrf-n9 (table 100).
+ # Reported as [SKIP] when VRF support (CONFIG_NET_VRF) is unavailable.
+ if [ "$have_vrf" = "1" ]; then
+ if run_test "2001:db8:f0::dead" "$srupf_vrf" "veth-n9-2-srupf"; then
+ echo "TEST: End.M.GTP6.D.Di (oif vrf-n9) [PASS]"
+ else
+ echo "TEST: End.M.GTP6.D.Di (oif vrf-n9) [FAIL]"
+ rc=1
+ fi
+ else
+ echo "TEST: End.M.GTP6.D.Di (oif vrf-n9) [SKIP] (CONFIG_NET_VRF not loaded)"
+ fi
+
+ # Inner T-PDU netfilter hook: only meaningful when nft is present
+ # and the kernel exposes net.netfilter.nf_hooks_lwtunnel.
+ if command -v nft >/dev/null && \
+ ip netns exec "$srgw" sysctl -wq \
+ net.netfilter.nf_hooks_lwtunnel=1 2>/dev/null; then
+ ip netns exec "$srgw" nft add table ip filter
+ ip netns exec "$srgw" nft \
+ 'add chain ip filter prerouting { type filter hook prerouting priority 0; }'
+
+ if run_nf_test drop ""; then
+ echo "TEST: End.M.GTP6.D.Di (nft drop on inner) [PASS]"
+ else
+ echo "TEST: End.M.GTP6.D.Di (nft drop on inner) [FAIL]"
+ rc=1
+ fi
+
+ if run_nf_test accept "2001:db8:f::dead"; then
+ echo "TEST: End.M.GTP6.D.Di (nft accept on inner) [PASS]"
+ else
+ echo "TEST: End.M.GTP6.D.Di (nft accept on inner) [FAIL]"
+ rc=1
+ fi
+ else
+ echo "TEST: End.M.GTP6.D.Di (inner-flow netfilter hook) [SKIP]" \
+ "(nft or nf_hooks_lwtunnel unavailable)"
+ fi
+
+ if [ "$rc" -eq 0 ]; then
+ echo "TEST: End.M.GTP6.D.Di [PASS]"
+ exit "$ksft_pass"
+ else
+ echo "TEST: End.M.GTP6.D.Di [FAIL]"
+ exit "$ksft_fail"
+ fi
+}
+
+main "$@"
--
2.50.1
* [PATCH v2 6/7] seg6: add H.M.GTP4.D behavior
2026-05-04 16:30 [PATCH v2 0/7] seg6: add SRv6 Mobile User Plane (RFC 9433) behaviors Yuya Kusakabe
` (4 preceding siblings ...)
2026-05-04 16:30 ` [PATCH v2 5/7] seg6: add End.M.GTP6.D.Di behavior Yuya Kusakabe
@ 2026-05-04 16:30 ` Yuya Kusakabe
2026-05-04 16:30 ` [PATCH v2 7/7] Documentation: networking: add seg6_mobile guide Yuya Kusakabe
2026-05-04 23:39 ` [PATCH v2 0/7] seg6: add SRv6 Mobile User Plane (RFC 9433) behaviors Jakub Kicinski
7 siblings, 0 replies; 11+ messages in thread
From: Yuya Kusakabe @ 2026-05-04 16:30 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Andrea Mayer, Shuah Khan, Jonathan Corbet,
Shuah Khan
Cc: linux-kernel, netdev, linux-kselftest, linux-doc, Yuya Kusakabe

Add the H.M.GTP4.D headend behavior (RFC 9433 Section 6.7), which
receives an IPv4/UDP/GTP-U packet on a configured IPv4 route and
re-encapsulates it in IPv6 + (optional) SRH toward an SR Gateway
running End.M.GTP4.E. The constructed End.M.GTP4.E SID encodes
the legacy IPv4 destination and the per-session arguments derived
from the GTP-U header so the egress can rebuild the IPv4/GTP-U
encapsulation (RFC 9433 Section 6.6 Figure 9).
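
To make the SID packing concrete, here is a hedged Python sketch of
"locator | IPv4 DA | Args.Mob.Session". It assumes a 40-bit
Args.Mob.Session (QFI, R, U, 32-bit PDU Session ID) per RFC 9433 and
that the TEID is carried as the PDU Session ID with R and U clear; the
kernel helper that builds the argument is not shown in this message,
and set_bits / build_egress_sid are hypothetical names.

```python
import ipaddress

ARGS_MOB_BITS = 40  # assumed: QFI(6) + R(1) + U(1) + PDU Session ID(32)


def set_bits(addr, offset, nbits, value):
    """Write the low nbits of value into a 128-bit int, offset bits from the MSB."""
    shift = 128 - offset - nbits
    mask = ((1 << nbits) - 1) << shift
    return (addr & ~mask) | ((value << shift) & mask)


def build_egress_sid(locator, prefix_bits, v4_mask_len, v4_da, qfi, teid):
    """Hypothetical model of the End.M.GTP4.E SID layout, not kernel code."""
    sa_bits = min(v4_mask_len, 32)
    if prefix_bits + sa_bits + ARGS_MOB_BITS > 128:
        raise ValueError("SID fields do not fit in 128 bits")
    sid = int(ipaddress.IPv6Address(locator))
    v4 = int(ipaddress.IPv4Address(v4_da))
    if sa_bits:
        # Only the leading sa_bits of the IPv4 DA are embedded.
        sid = set_bits(sid, prefix_bits, sa_bits, v4 >> (32 - sa_bits))
    # Assumed layout: QFI in the top 6 bits, R and U clear, TEID below.
    args = ((qfi & 0x3F) << 34) | (teid & 0xFFFFFFFF)
    sid = set_bits(sid, prefix_bits + sa_bits, ARGS_MOB_BITS, args)
    return ipaddress.IPv6Address(sid)


# Locator and lengths follow the configuration example in this message;
# the IPv4 DA, QFI and TEID are made-up sample values.
print(build_egress_sid("2001:db8::", 32, 32, "10.99.0.1", qfi=5, teid=0x123))
```

The length check mirrors the one in seg6_mobile_fill_egress_sid(): with
a /32 locator and a 32-bit IPv4 DA, 24 bits remain unused after the
argument.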

This is the only behavior in seg6_local that runs on AF_INET routes;
the rest are IPv6-only. To support that, the seg6_action_desc
framework gains an explicit input_family field, the build_state
path now accepts AF_INET in addition to AF_INET6, and
seg6_local_input() switches to an NF_HOOK that uses the right
nfproto for the inbound packet.

PMTU is honored on the encap side: when the post-encap length
exceeds the egress MTU and the IPv4 outer carries DF, the kernel
sends an ICMP Fragmentation Needed back to the originator before
dropping. GSO packets that would not fit are dropped without a
notification because the GSO segmenter cannot fix this up after
the network protocol has changed from IPv4 to IPv6.
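
The size arithmetic behind that check can be sketched as follows. This
is an illustrative Python model under the assumption that the SRH takes
8 + 16 * N octets for N segments; post_encap_sizes is a hypothetical
name, and the model covers only the length comparison, not the
DF or GSO handling.

```python
IPV6_HDR_LEN = 40


def post_encap_sizes(pkt_len, outer_len, srh_segments, mtu):
    """Return (post_encap_len, fits, upstream_mtu) for a hypothetical packet.

    pkt_len      -- length of the received IPv4/UDP/GTP-U packet
    outer_len    -- octets stripped (IPv4 + UDP + GTP-U headers)
    srh_segments -- number of SIDs in the SRH, 0 when no SRH is pushed
    """
    srh_len = 8 + 16 * srh_segments if srh_segments else 0
    new_outer = IPV6_HDR_LEN + srh_len
    post_encap = pkt_len - outer_len + new_outer
    # Signed arithmetic: outer_len can exceed new_outer when the IPv4
    # header carries options and no SRH is added.
    upstream_mtu = mtu + outer_len - new_outer
    return post_encap, post_encap <= mtu, upstream_mtu


# 20 (IPv4) + 8 (UDP) + 16 (GTP-U with one extension header) = 44 octets
# stripped, two-segment SRH pushed, egress MTU 1500.
print(post_encap_sizes(1500, 44, 2, 1500))
```

In this sample the encap grows the packet by 36 octets, so a full-size
1500-octet input no longer fits and the largest input that would fit is
1464 octets.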

When net.netfilter.nf_hooks_lwtunnel=1, the inner T-PDU traverses
NF_INET_PRE_ROUTING between the GTP-U strip and the SRv6 push,
mirroring End.DX4 / End.DX6.

Non-T-PDU GTP-U messages are forwarded the same way as in
End.M.GTP6.D: passed through via the lwtunnel's saved orig_input
to a downstream peer that owns the GTP-U control plane.

Configuration:

    ip -4 route add 10.99.0.0/24 \
        encap seg6local action H.M.GTP4.D \
        nh6 2001:db8:: \
        src 2001:db8:2::1 \
        v4_mask_len 32 sr_prefix_len 32 \
        dev <dev>

Link: https://www.rfc-editor.org/rfc/rfc9433.html#section-6.7
Link: https://www.rfc-editor.org/rfc/rfc6040
Signed-off-by: Yuya Kusakabe <yuya.kusakabe@gmail.com>
---
include/uapi/linux/seg6_local.h | 2 +
net/ipv6/seg6_local.c | 422 +++++++++++++++++-
tools/testing/selftests/net/Makefile | 1 +
.../testing/selftests/net/srv6_h_m_gtp4_d_test.sh | 487 +++++++++++++++++++++
4 files changed, 909 insertions(+), 3 deletions(-)
diff --git a/include/uapi/linux/seg6_local.h b/include/uapi/linux/seg6_local.h
index 326da65ad5aa..e6bb57129fdc 100644
--- a/include/uapi/linux/seg6_local.h
+++ b/include/uapi/linux/seg6_local.h
@@ -82,6 +82,8 @@ enum {
SEG6_LOCAL_ACTION_END_M_GTP6_D = 20,
/* IPv6/GTP-U decap into SRv6, drop-in mode (RFC 9433 Section 6.4) */
SEG6_LOCAL_ACTION_END_M_GTP6_D_DI = 21,
+ /* SR headend: IPv4/GTP-U decap, encap in SRv6 (RFC 9433 Section 6.7) */
+ SEG6_LOCAL_ACTION_H_M_GTP4_D = 22,
__SEG6_LOCAL_ACTION_MAX,
};
diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c
index a6cd57ebcbde..fe7799aeaa53 100644
--- a/net/ipv6/seg6_local.c
+++ b/net/ipv6/seg6_local.c
@@ -38,6 +38,7 @@
#include <linux/unaligned.h>
#include <net/gso.h>
#include <net/gtp.h>
+#include <net/icmp.h>
#define SEG6_F_ATTR(i) BIT(i)
@@ -52,6 +53,11 @@ struct seg6_local_lwtunnel_ops {
struct seg6_action_desc {
int action;
+ /* Address family of the FIB hook the route is installed on.
+ * Defaults to AF_INET6 when 0; entries that run on IPv4 routes
+ * (currently only H.M.GTP4.D) set this to AF_INET explicitly.
+ */
+ int input_family;
unsigned long attrs;
/* The optattrs field is used for specifying all the optional
@@ -2788,6 +2794,342 @@ static int input_action_end_m_gtp6_d_di(struct sk_buff *skb,
return -EINVAL;
}
+/* Overlay @v4 into @addr right after a @v6_src_prefix_len-bit prefix
+ * (default /64), per RFC 9433 Section 6.6 Figure 10.
+ */
+static void seg6_mobile_overlay_v4(struct in6_addr *addr, u8 v4_mask_len,
+ u8 v6_src_prefix_len, __be32 v4)
+{
+ u8 p_bits = v6_src_prefix_len ? : SEG6_MOBILE_V6_SRC_PREFIX_LEN_DEFAULT;
+ u8 sa_bits = min_t(u8, v4_mask_len, 32);
+ u64 v4_left;
+
+ if (!sa_bits || (unsigned int)p_bits + sa_bits > 128)
+ return;
+
+ v4_left = (u64)ntohl(v4) << 32;
+ seg6_mobile_addr_set_bits(addr->s6_addr, p_bits, sa_bits, v4_left);
+}
+
+/* Encode the IPv4 DA and Args.Mob.Session into @sid right after a
+ * @prefix_bits-bit locator, per RFC 9433 Section 6.7 Figure 11.
+ */
+static int seg6_mobile_fill_egress_sid(struct in6_addr *sid,
+ unsigned int prefix_bits,
+ u8 v4_mask_len, __be32 v4, u64 args)
+{
+ u8 sa_bits = min_t(u8, v4_mask_len, 32);
+ u64 v4_left;
+
+ if (prefix_bits + sa_bits + SEG6_MOBILE_ARGS_MOB_LEN > 128)
+ return -EINVAL;
+
+ if (sa_bits) {
+ v4_left = (u64)ntohl(v4) << 32;
+ seg6_mobile_addr_set_bits(sid->s6_addr, prefix_bits, sa_bits,
+ v4_left);
+ }
+
+ seg6_mobile_addr_set_bits(sid->s6_addr, prefix_bits + sa_bits,
+ SEG6_MOBILE_ARGS_MOB_LEN, args);
+ return 0;
+}
+
+/* Per-skb context preserved across the NF_INET_PRE_ROUTING hook on
+ * the inner T-PDU exposed by H.M.GTP4.D. The inbound IPv4 outer is
+ * gone by the time the finish half runs, but the new SRv6 outer
+ * still needs the constructed End.M.GTP4.E SID, the source IPv6
+ * address, and the Traffic Class computed in the input half.
+ */
+struct seg6_mobile_h_gtp4_d_cb {
+ struct in6_addr new_da;
+ struct in6_addr new_sa;
+ u8 outer_tclass;
+};
+
+#define SEG6_MOBILE_H_GTP4_D_CB(skb) \
+ ((struct seg6_mobile_h_gtp4_d_cb *)((skb)->cb))
+
+static int input_action_h_m_gtp4_d_finish(struct net *net,
+ struct sock *sk,
+ struct sk_buff *skb)
+{
+ struct seg6_mobile_h_gtp4_d_cb cb = *SEG6_MOBILE_H_GTP4_D_CB(skb);
+ struct dst_entry *orig_dst = skb_dst(skb);
+ enum skb_drop_reason reason = SKB_DROP_REASON_SEG6_MOBILE_NOMEM;
+ struct seg6_local_lwt *slwt;
+ struct ipv6_sr_hdr *new_srh;
+ struct ipv6hdr *new_ip6h;
+ int inner_proto;
+ int err;
+
+ slwt = seg6_local_lwtunnel(orig_dst->lwtstate);
+
+ inner_proto = (skb->protocol == htons(ETH_P_IP)) ? IPPROTO_IPIP
+ : IPPROTO_IPV6;
+
+ if (slwt->srh) {
+ struct ipv6hdr *outer_ip6h;
+
+ /* Multi-segment SR Policy: prepend ipv6 + SRH and
+ * overwrite the last segment with the constructed
+ * End.M.GTP4.E SID.
+ */
+ err = seg6_do_srh_encap(skb, slwt->srh, inner_proto);
+ if (err) {
+ if (err != -ENOMEM)
+ reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_INNER;
+ goto drop;
+ }
+
+ skb->protocol = htons(ETH_P_IPV6);
+
+ new_srh = (struct ipv6_sr_hdr *)(skb_network_header(skb) +
+ sizeof(struct ipv6hdr));
+ new_srh->segments[0] = cb.new_da;
+
+ /* seg6_do_srh_encap() zeroes the outer Traffic Class for
+ * IPv4 inners. Overwrite it with the RFC 6040 normal-mode
+ * value computed at the input half so the SR domain sees
+ * the inner DSCP/ECN.
+ */
+ outer_ip6h = ipv6_hdr(skb);
+ ipv6_change_dsfield(outer_ip6h, 0, cb.outer_tclass);
+
+ /* seg6_do_srh_encap() sets the outer daddr from
+ * segments[first_segment]. When first_segment == 0 the
+ * write above replaced that slot's content, so the outer
+ * daddr would still point at the user-provided segment
+ * value rather than the constructed End.M.GTP4.E SID.
+ * Re-read from segments[first_segment] after the write
+ * for correctness in that case (no-op when
+ * first_segment > 0).
+ */
+ ipv6_hdr(skb)->daddr =
+ new_srh->segments[new_srh->first_segment];
+ ipv6_hdr(skb)->saddr = cb.new_sa;
+
+ skb_set_transport_header(skb, sizeof(struct ipv6hdr));
+ } else {
+ /* Single-segment encap (no SRH): RFC 8754 Section 4.1
+ * allows omitting the SRH when there is exactly one
+ * segment.
+ */
+ if (skb_cow_head(skb, sizeof(*new_ip6h)))
+ goto drop;
+
+ new_ip6h = skb_push(skb, sizeof(*new_ip6h));
+ skb_reset_network_header(skb);
+ memset(new_ip6h, 0, sizeof(*new_ip6h));
+ /* RFC 6040 normal-mode propagation of inner DSCP/ECN. */
+ ip6_flow_hdr(new_ip6h, cb.outer_tclass, 0);
+ new_ip6h->payload_len = htons(skb->len - sizeof(*new_ip6h));
+ new_ip6h->nexthdr = inner_proto;
+ new_ip6h->hop_limit = IPV6_DEFAULT_HOPLIMIT;
+ new_ip6h->saddr = cb.new_sa;
+ new_ip6h->daddr = cb.new_da;
+ skb->protocol = htons(ETH_P_IPV6);
+ skb_set_transport_header(skb, sizeof(*new_ip6h));
+ }
+
+ nf_reset_ct(skb);
+ skb_dst_drop(skb);
+
+ seg6_lookup_any_nexthop(skb, NULL, 0, false, slwt->oif);
+ return dst_input(skb);
+
+drop:
+ kfree_skb_reason(skb, reason);
+ return -EINVAL;
+}
+
+static int input_action_h_m_gtp4_d(struct sk_buff *skb,
+ struct seg6_local_lwt *slwt)
+{
+ unsigned int outer_len, inner_off;
+ struct in6_addr new_da, new_sa;
+ struct seg6_mobile_h_gtp4_d_cb *cb;
+ int gtp_hdrlen;
+ __be32 v4_da, v4_sa;
+ struct iphdr *ip4h;
+ __be16 frag_off;
+ struct udphdr *uh;
+ int inner_nfproto;
+ u8 inner_first;
+ u8 inner_dsfield;
+ u8 inner_proto;
+ u64 args_mob;
+ u32 teid;
+ int ihl;
+ u8 qfi;
+ const struct seg6_mobile_info *minfo = &slwt->mobile_info;
+ enum skb_drop_reason reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_GTPU;
+
+ BUILD_BUG_ON(sizeof(struct seg6_mobile_h_gtp4_d_cb) >
+ sizeof_field(struct sk_buff, cb));
+
+ if (!pskb_may_pull(skb, sizeof(*ip4h))) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_INNER;
+ goto drop;
+ }
+
+ ip4h = ip_hdr(skb);
+ if (ip4h->protocol != IPPROTO_UDP) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_INNER;
+ goto drop;
+ }
+
+ /* ip_rcv_core() rejects ihl < 5, but enforce it here too so the
+ * lwtunnel is self-contained against future callers that bypass
+ * the IPv4 receive entry path.
+ */
+ if (ip4h->ihl < 5) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_INNER;
+ goto drop;
+ }
+
+ ihl = ip4h->ihl * 4;
+ if (!pskb_may_pull(skb, ihl + sizeof(*uh))) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_GTPU;
+ goto drop;
+ }
+
+ ip4h = ip_hdr(skb);
+ uh = (struct udphdr *)((u8 *)ip4h + ihl);
+ if (uh->dest != htons(GTP1U_PORT))
+ goto drop;
+
+ /* Snapshot the outer IPv4 fields before seg6_mobile_parse_gtpu(),
+ * whose internal pskb_may_pull() calls may reallocate skb->head
+ * and invalidate ip4h.
+ */
+ v4_da = ip4h->daddr;
+ v4_sa = ip4h->saddr;
+ frag_off = ip4h->frag_off;
+ inner_dsfield = ipv4_get_dsfield(ip4h);
+
+ gtp_hdrlen = seg6_mobile_parse_gtpu(skb, ihl + sizeof(*uh),
+ &teid, &qfi);
+ if (gtp_hdrlen == -EOPNOTSUPP)
+ return seg6_mobile_passthrough_non_tpdu(skb);
+ if (gtp_hdrlen < 0)
+ goto drop;
+
+ args_mob = seg6_mobile_args_from_teid_qfi(teid, qfi);
+
+ new_da = slwt->nh6;
+ if (seg6_mobile_fill_egress_sid(&new_da, minfo->sr_prefix_len,
+ minfo->v4_mask_len, v4_da, args_mob)) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_SID;
+ goto drop;
+ }
+
+ new_sa = minfo->src_addr;
+ seg6_mobile_overlay_v4(&new_sa, minfo->v4_mask_len, minfo->v6_src_prefix_len,
+ v4_sa);
+
+ outer_len = ihl + sizeof(*uh) + gtp_hdrlen;
+ if (!pskb_may_pull(skb, outer_len + 1)) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_INNER;
+ goto drop;
+ }
+
+ inner_off = outer_len;
+ inner_first = *((u8 *)skb->data + inner_off);
+ switch (inner_first >> 4) {
+ case 4:
+ inner_proto = IPPROTO_IPIP;
+ inner_nfproto = NFPROTO_IPV4;
+ break;
+ case 6:
+ inner_proto = IPPROTO_IPV6;
+ inner_nfproto = NFPROTO_IPV6;
+ break;
+ default:
+ reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_INNER;
+ goto drop;
+ }
+
+ /* PMTU: H.M.GTP4.D strips IPv4/UDP/GTP-U (>=36 B) and prepends
+ * IPv6 + (optional) SRH (40 B base + segments). Net delta can
+ * be positive (encap grows) when the SR Policy has multiple
+ * segments or when GTP-U was a short header. Reject and
+ * inform the source via ICMP_DEST_UNREACH/FRAG_NEEDED if the
+ * result would not fit.
+ */
+ {
+ unsigned int srh_len = slwt->srh ?
+ ((slwt->srh->hdrlen + 1) << 3) : 0;
+ unsigned int new_outer = sizeof(struct ipv6hdr) + srh_len;
+ unsigned int post_encap = skb->len - outer_len + new_outer;
+ unsigned int mtu = dst_mtu(skb_dst(skb));
+ /* Compute the upstream-equivalent MTU as a signed delta:
+ * IPv4 options can make outer_len > new_outer, in which
+ * case unsigned subtraction would wrap. All values fit
+ * comfortably in int (mtu <= 64K, outer_len <= ~84,
+ * new_outer <= ~2 KiB).
+ */
+ int upstream_mtu = (int)mtu + (int)outer_len - (int)new_outer;
+
+ if (mtu && post_encap > mtu) {
+ if (frag_off & htons(IP_DF)) {
+ icmp_ndo_send(skb, ICMP_DEST_UNREACH,
+ ICMP_FRAG_NEEDED,
+ htonl(upstream_mtu > 0 ?
+ upstream_mtu : 0));
+ }
+ reason = SKB_DROP_REASON_SEG6_MOBILE_MTU_EXCEEDED;
+ goto drop;
+ }
+
+ if (skb_is_gso(skb) && mtu &&
+ (upstream_mtu <= 0 ||
+ !skb_gso_validate_network_len(skb, upstream_mtu))) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_MTU_EXCEEDED;
+ goto drop;
+ }
+ }
+
+ if (!pskb_may_pull(skb, outer_len +
+ ((inner_proto == IPPROTO_IPIP) ?
+ sizeof(struct iphdr) : sizeof(struct ipv6hdr)))) {
+ reason = SKB_DROP_REASON_SEG6_MOBILE_BAD_INNER;
+ goto drop;
+ }
+
+ skb_pull_rcsum(skb, outer_len);
+ skb_reset_network_header(skb);
+
+ skb->protocol = (inner_proto == IPPROTO_IPIP) ? htons(ETH_P_IP)
+ : htons(ETH_P_IPV6);
+
+ skb_set_transport_header(skb,
+ (inner_proto == IPPROTO_IPIP) ?
+ sizeof(struct iphdr) :
+ sizeof(struct ipv6hdr));
+ nf_reset_ct(skb);
+
+ cb = SEG6_MOBILE_H_GTP4_D_CB(skb);
+ cb->new_da = new_da;
+ cb->new_sa = new_sa;
+ /* RFC 6040 normal-mode propagation: copy the outer IPv4 (incoming
+ * GTP-U envelope) DSCP+ECN verbatim into the new outer IPv6
+ * Traffic Class.
+ */
+ cb->outer_tclass = inner_dsfield;
+
+ if (static_branch_unlikely(&nf_hooks_lwtunnel_enabled))
+ return NF_HOOK(inner_nfproto, NF_INET_PRE_ROUTING,
+ dev_net(skb->dev), NULL, skb, skb->dev,
+ NULL, input_action_h_m_gtp4_d_finish);
+
+ return input_action_h_m_gtp4_d_finish(dev_net(skb->dev), NULL, skb);
+
+drop:
+ kfree_skb_reason(skb, reason);
+ return -EINVAL;
+}
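The PMTU arithmetic in input_action_h_m_gtp4_d() above can be checked by hand. The following is a minimal Python sketch of that arithmetic only; it is not kernel code. The function and parameter names are hypothetical, and the ICMP generation and GSO validation are deliberately left out.

```python
IPV6_HDR_LEN = 40  # sizeof(struct ipv6hdr)


def srh_len(hdrlen):
    """Byte length of an SRH from its Hdr Ext Len field; None means no SRH."""
    return 0 if hdrlen is None else (hdrlen + 1) << 3


def pmtu_check(skb_len, outer_len, hdrlen, mtu):
    """Mirror the size check: return (fits, post_encap, upstream_mtu).

    skb_len   -- length of the inbound IPv4/UDP/GTP-U packet
    outer_len -- IPv4 + UDP + GTP-U header bytes that are stripped
    hdrlen    -- SRH Hdr Ext Len (8-octet units), or None without an SRH
    mtu       -- MTU of the route the packet arrived on (0 disables the check)
    """
    new_outer = IPV6_HDR_LEN + srh_len(hdrlen)
    post_encap = skb_len - outer_len + new_outer
    # Signed on purpose: outer_len can exceed new_outer (IPv4 options).
    upstream_mtu = mtu + outer_len - new_outer
    fits = not (mtu and post_encap > mtu)
    return fits, post_encap, max(upstream_mtu, 0)


# Short GTP-U header (20 + 8 + 8 = 36 B stripped), no SRH: grows by 4 B.
print(pmtu_check(1500, 36, None, 1500))
# GTP-U header with PDU Session Container (16 B): shrinks by 4 B.
print(pmtu_check(1500, 44, None, 1500))
```

In the first case the packet no longer fits, and 1496 is the value that would be advertised in the ICMP Fragmentation Needed message when DF is set.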
+
/* RFC 9433 Section 6.5 -- End.M.GTP6.E
* Receives an SRv6 packet whose current SID is an End.M.GTP6.E SID
* (Segments Left == 1) and re-encapsulates the inner payload in
@@ -3135,6 +3477,22 @@ static struct seg6_action_desc seg6_action_table[] = {
.destroy_state = seg6_end_m_gtp6_d_aug_destroy,
},
},
+ {
+ .action = SEG6_LOCAL_ACTION_H_M_GTP4_D,
+ .input_family = AF_INET,
+ .attrs = SEG6_F_ATTR(SEG6_LOCAL_NH6) |
+ SEG6_F_ATTR(SEG6_LOCAL_MOBILE_SRC_ADDR) |
+ SEG6_F_ATTR(SEG6_LOCAL_MOBILE_V4_MASK_LEN) |
+ SEG6_F_ATTR(SEG6_LOCAL_MOBILE_SR_PREFIX_LEN),
+ .optattrs = SEG6_F_LOCAL_COUNTERS |
+ SEG6_F_ATTR(SEG6_LOCAL_SRH) |
+ SEG6_F_ATTR(SEG6_LOCAL_MOBILE_V6_SRC_PREFIX_LEN) |
+ SEG6_F_ATTR(SEG6_LOCAL_OIF),
+ .input = input_action_h_m_gtp4_d,
+ .slwt_ops = {
+ .build_state = seg6_mobile_v4_validate,
+ },
+ },
{
.action = SEG6_LOCAL_ACTION_END_MAP,
.attrs = SEG6_F_ATTR(SEG6_LOCAL_NH6),
@@ -3206,13 +3564,22 @@ static int seg6_local_input_core(struct net *net, struct sock *sk,
static int seg6_local_input(struct sk_buff *skb)
{
- if (skb->protocol != htons(ETH_P_IPV6)) {
+ int nfproto;
+
+ switch (skb->protocol) {
+ case htons(ETH_P_IPV6):
+ nfproto = NFPROTO_IPV6;
+ break;
+ case htons(ETH_P_IP):
+ nfproto = NFPROTO_IPV4;
+ break;
+ default:
kfree_skb(skb);
return -EINVAL;
}
if (static_branch_unlikely(&nf_hooks_lwtunnel_enabled))
- return NF_HOOK(NFPROTO_IPV6, NF_INET_LOCAL_IN,
+ return NF_HOOK(nfproto, NF_INET_LOCAL_IN,
dev_net(skb->dev), NULL, skb, skb->dev, NULL,
seg6_local_input_core);
@@ -3625,6 +3992,44 @@ static int seg6_mobile_v4_validate(struct seg6_local_lwt *slwt,
"SRv6 Mobile v6_src_prefix_len must leave room for the 32-bit IPv4 source template (prefix_len <= 96)");
return -EINVAL;
}
+
+ /* H.M.GTP4.D constructs an End.M.GTP4.E SID at egress time whose
+ * layout (RFC 9433 Section 6.7 / 6.6 Figure 9) is
+ * locator (sr_prefix_len) | IPv4 DA (v4_mask_len) | Args.Mob.Session (40)
+ * so the three lengths together must fit in 128 bits. End.M.GTP4.E
+ * leaves sr_prefix_len at 0 (the attribute is not in its action_table
+ * entry), so this check is a no-op for End.M.GTP4.E.
+ */
+ if (minfo->sr_prefix_len &&
+ (unsigned int)minfo->sr_prefix_len + (unsigned int)minfo->v4_mask_len +
+ SEG6_MOBILE_ARGS_MOB_LEN > 128) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "SRv6 Mobile sr_prefix_len + v4_mask_len + 40 (Args.Mob.Session) must not exceed 128");
+ return -EINVAL;
+ }
+
+ /* End.M.GTP4.E SID layout (RFC 9433 Section 6.6 Figure 9):
+ * locator (route prefix) | IPv4 DA (v4_mask_len) | Args.Mob.Session (40) | pad
+ *
+ * The locator length comes from the IPv6 route's destination prefix
+ * length, not from sr_prefix_len. Only End.M.GTP4.E (AF_INET6 route)
+ * needs this check; H.M.GTP4.D requires sr_prefix_len so this branch
+ * is unreachable for it. Gate on input_family so the @cfg cast to
+ * struct fib6_config * is type-correct.
+ */
+ if (!minfo->sr_prefix_len &&
+ (slwt->desc->input_family ? : AF_INET6) == AF_INET6) {
+ const struct fib6_config *fib6_cfg = cfg;
+
+ if ((unsigned int)fib6_cfg->fc_dst_len +
+ (unsigned int)minfo->v4_mask_len +
+ SEG6_MOBILE_ARGS_MOB_LEN > 128) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "End.M.GTP4.E route prefix length + v4_mask_len + 40 (Args.Mob.Session) must not exceed 128");
+ return -EINVAL;
+ }
+ }
+
return 0;
}
@@ -4323,7 +4728,7 @@ static int seg6_local_build_state(struct net *net, struct nlattr *nla,
struct seg6_local_lwt *slwt;
int err;
- if (family != AF_INET6)
+ if (family != AF_INET6 && family != AF_INET)
return -EINVAL;
err = nla_parse_nested_deprecated(tb, SEG6_LOCAL_MAX, nla,
@@ -4346,6 +4751,17 @@ static int seg6_local_build_state(struct net *net, struct nlattr *nla,
if (err < 0)
goto out_free;
+ /* Reject behaviors that are not registered for the route family
+ * the lwtunnel is being installed on. input_family defaults to
+ * AF_INET6; H.M.GTP4.D is the only AF_INET behavior.
+ */
+ if ((slwt->desc->input_family ? : AF_INET6) != family) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "seg6local action does not support this address family");
+ err = -EINVAL;
+ goto out_destroy_attrs;
+ }
+
err = seg6_local_lwtunnel_build_state(slwt, cfg, extack);
if (err < 0)
goto out_destroy_attrs;
diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index a770e711652e..30adf2474c45 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -97,6 +97,7 @@ TEST_PROGS := \
srv6_end_map_test.sh \
srv6_end_next_csid_l3vpn_test.sh \
srv6_end_x_next_csid_l3vpn_test.sh \
+ srv6_h_m_gtp4_d_test.sh \
srv6_hencap_red_l3vpn_test.sh \
srv6_hl2encap_red_l2vpn_test.sh \
srv6_iptunnel_cache.sh \
diff --git a/tools/testing/selftests/net/srv6_h_m_gtp4_d_test.sh b/tools/testing/selftests/net/srv6_h_m_gtp4_d_test.sh
new file mode 100755
index 000000000000..3e5f4f74c656
--- /dev/null
+++ b/tools/testing/selftests/net/srv6_h_m_gtp4_d_test.sh
@@ -0,0 +1,487 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# shellcheck disable=SC2034,SC2154
+#
+# Selftest for the SRv6 H.M.GTP4.D behavior (RFC 9433 Section 6.7).
+#
+# +-------+ 10.0.0.0/24 +-------+ 2001:db8:2::/64 +-------+
+# | gnb | ------------------- | srgw | ------------------- | srupf |
+# +-------+ veth-n3 +-------+ veth-n9 +-------+
+# |
+# | 10.10.0.0/24
+# +--------veth-n6--------- +-------+
+# | lupf |
+# +-------+
+#
+# gnb is the GTP-U-side test peer that injects the GTP-U packets.
+# srupf is the SR-domain-side SRv6-aware UPF (RFC 9433 sense, not
+# a 3GPP UPF) that receives the resulting SRv6 T-PDU. lupf is the
+# SRv6-non-aware legacy UPF that owns the GTP-U control plane and
+# receives non-T-PDU GTP-U (Echo Request, Error Indication, ...)
+# forwarded by srgw via the H.M.GTP4.D route's dev. srgw runs the
+# H.M.GTP4.D behavior under test.
+#
+# An H.M.GTP4.D SID is installed on the SR ingress for IPv4 destination
+# 10.99.0.0/24 with v4_mask_len=32 and sr_prefix_len=32; Args.Mob.Session is
+# the fixed 40-bit field defined by RFC 9433 Section 6.1, Figure 8. The
+# H.M.GTP4.D SID locator prefix is 2001:db8::, so an inbound IPv4/UDP/GTP-U
+# packet to 10.99.0.2 with TEID 0x123 (and PDU Session ext carrying QFI=5) is
+# expected to come out as IPv6 toward 2001:db8:a63:2:1400:1:2300:0,
+# where:
+#
+# bytes 0-3 (locator /32) = 20 01 0d b8
+# bytes 4-7 (IPv4 DA, 32-bit) = 0a 63 00 02 (= 10.99.0.2)
+# bytes 8-12 (Args.Mob.Session) = 14 00 00 01 23
+# (0x14 = QFI 5 << 2 with R = U = 0, then the 32-bit PDU Session ID carrying TEID 0x123)
+# bytes 13-15 (SID padding) = 00 00 00
+
+source lib.sh
+
+readonly TIMEOUT=4
+
+tcpdump_pid=""
+have_vrf=0
+
+cleanup()
+{
+ if [ -n "$tcpdump_pid" ]; then
+ kill "$tcpdump_pid" 2>/dev/null
+ wait "$tcpdump_pid" 2>/dev/null
+ fi
+ cleanup_all_ns
+}
+
+trap cleanup EXIT
+
+setup()
+{
+ setup_ns gnb srgw srupf lupf srupf_vrf
+
+ ip -n "$gnb" link set lo up
+ ip -n "$srgw" link set lo up
+ ip -n "$srupf" link set lo up
+ ip -n "$lupf" link set lo up
+ ip -n "$srupf_vrf" link set lo up
+
+ ip link add veth-n3 netns "$gnb" type veth peer name veth-n3-srgw \
+ netns "$srgw"
+ ip -n "$gnb" addr add 10.0.0.2/24 dev veth-n3
+ ip -n "$srgw" addr add 10.0.0.1/24 dev veth-n3-srgw
+ ip -n "$gnb" link set veth-n3 up
+ ip -n "$srgw" link set veth-n3-srgw up
+
+ ip link add veth-n9 netns "$srgw" type veth peer name veth-n9-srupf \
+ netns "$srupf"
+ ip -n "$srgw" addr add 2001:db8:2::1/64 dev veth-n9 nodad
+ ip -n "$srupf" addr add 2001:db8:2::e/64 dev veth-n9-srupf nodad
+ ip -n "$srgw" link set veth-n9 up
+ ip -n "$srupf" link set veth-n9-srupf up
+
+ # Legacy IPv4 UPF reachable from srgw; non-T-PDU GTP-U is forwarded
+ # here via the H.M.GTP4.D route's dev so the legacy GTP-U control
+ # plane (Echo Request / Response) can be answered downstream.
+ ip link add veth-n6 netns "$srgw" type veth peer name veth-n6-lupf \
+ netns "$lupf"
+ ip -n "$srgw" addr add 10.10.0.1/24 dev veth-n6
+ ip -n "$lupf" addr add 10.10.0.2/24 dev veth-n6-lupf
+ ip -n "$srgw" link set veth-n6 up
+ ip -n "$lupf" link set veth-n6-lupf up
+
+ ip netns exec "$srgw" sysctl -wq net.ipv4.ip_forward=1
+ ip netns exec "$srgw" sysctl -wq net.ipv6.conf.all.forwarding=1
+
+ ip -n "$gnb" route add 10.99.0.0/24 via 10.0.0.1
+
+ # Install H.M.GTP4.D on an IPv4 route. sr_prefix_len declares the
+ # locator length used by the remote End.M.GTP4.E SID. dev veth-n6
+ # is the legacy UPF leg: T-PDU encap takes the IPv6 SR Policy path
+ # (independent of dst.dev) while non-T-PDU is forwarded out veth-n6
+ # via ip_forward.
+ ip -n "$srgw" -4 route add 10.99.0.0/24 \
+ encap seg6local action H.M.GTP4.D \
+ nh6 2001:db8:: \
+ src 2001:db8:2::1 \
+ v4_mask_len 32 sr_prefix_len 32 \
+ dev veth-n6
+
+ # srgw needs to reach the constructed SID; the /32 prefix covers
+ # any IPv4 DA + Args.Mob.Session combination derived from the
+ # locator 2001:db8::.
+ ip -n "$srgw" -6 route add 2001:db8::/32 \
+ via 2001:db8:2::e dev veth-n9
+ ip -n "$srupf" -6 route add 2001:db8::/32 dev veth-n9-srupf
+
+ local upf_mac
+ upf_mac=$(ip -n "$srupf" -j link show veth-n9-srupf | \
+ python3 -c 'import sys, json; print(json.load(sys.stdin)[0]["address"])')
+ ip -n "$srgw" -6 neigh replace 2001:db8:2::e dev veth-n9 \
+ lladdr "$upf_mac" nud permanent 2>/dev/null || true
+
+ # Pre-resolve the IPv4 ARP entry for the SID-prefix DA so non-T-PDU
+ # Echo can be forwarded to lupf without ARP delay.
+ local lupf_mac
+ lupf_mac=$(ip -n "$lupf" -j link show veth-n6-lupf | \
+ python3 -c 'import sys, json; print(json.load(sys.stdin)[0]["address"])')
+ ip -n "$srgw" neigh replace 10.99.0.2 dev veth-n6 \
+ lladdr "$lupf_mac" nud permanent 2>/dev/null || true
+
+ # Per-route VRF case: a second SR-side upf in its own VRF. The
+ # H.M.GTP4.D SID for this tenant binds the SRv6 underlay output to
+ # the VRF via 'oif'. Reported as [SKIP] when VRF support (CONFIG_NET_VRF) is unavailable.
+ modprobe vrf 2>/dev/null
+ if ip -n "$srgw" link add vrf-n9 type vrf table 100 2>/dev/null; then
+ have_vrf=1
+ ip -n "$srgw" link set dev vrf-n9 up
+
+ ip link add veth-n9-2 netns "$srgw" type veth peer name \
+ veth-n9-2-srupf netns "$srupf_vrf"
+ ip -n "$srgw" link set dev veth-n9-2 master vrf-n9
+ ip -n "$srgw" addr add 2001:db8:4::1/64 dev veth-n9-2 nodad
+ ip -n "$srupf_vrf" addr add 2001:db8:4::e/64 dev veth-n9-2-srupf \
+ nodad
+ ip -n "$srgw" link set dev veth-n9-2 up
+ ip -n "$srupf_vrf" link set dev veth-n9-2-srupf up
+
+ # H.M.GTP4.D for a second IPv4 prefix bound to vrf-n9; the
+ # constructed SID's locator is 2001:db9::/32 (a separate locator
+ # so the two routes never collide).
+ ip -n "$srgw" -4 route add 10.99.1.0/24 \
+ encap seg6local action H.M.GTP4.D \
+ nh6 2001:db9:: \
+ src 2001:db8:2::1 \
+ v4_mask_len 32 sr_prefix_len 32 \
+ oif vrf-n9 \
+ dev veth-n9-2
+
+ # Reach the constructed SID via the VRF table.
+ ip -n "$srgw" -6 route add 2001:db9::/32 \
+ via 2001:db8:4::e dev veth-n9-2 vrf vrf-n9
+ ip -n "$srupf_vrf" -6 route add 2001:db9::/32 \
+ dev veth-n9-2-srupf
+
+ local upf_vrf_mac
+ upf_vrf_mac=$(ip -n "$srupf_vrf" -j link show \
+ veth-n9-2-srupf | python3 -c \
+ 'import sys, json; print(json.load(sys.stdin)[0]["address"])')
+ ip -n "$srgw" -6 neigh replace 2001:db8:4::e dev veth-n9-2 \
+ lladdr "$upf_vrf_mac" nud permanent 2>/dev/null || true
+
+ ip -n "$gnb" route add 10.99.1.0/24 via 10.0.0.1
+ fi
+}
+
+check_dependencies()
+{
+ if ! command -v tcpdump >/dev/null; then
+ echo "SKIP: tcpdump is required"; exit "$ksft_skip"
+ fi
+ if ! command -v python3 >/dev/null; then
+ echo "SKIP: python3 is required"; exit "$ksft_skip"
+ fi
+ if ! python3 -c "import scapy.all" 2>/dev/null; then
+ echo "SKIP: python3-scapy is required"; exit "$ksft_skip"
+ fi
+
+ if ! ip route help 2>&1 | grep -qF "H.M.GTP4.D"; then
+ echo "SKIP: iproute2 too old, missing seg6local action H.M.GTP4.D"
+ exit "$ksft_skip"
+ fi
+}
+
+send_gtpu()
+{
+ local v4_dst="$1"
+ local srgw_mac
+
+ srgw_mac=$(ip -n "$srgw" -j link show veth-n3-srgw | \
+ python3 -c 'import sys, json; print(json.load(sys.stdin)[0]["address"])')
+
+ SRGW_MAC="$srgw_mac" V4_DST="$v4_dst" ip netns exec "$gnb" python3 - <<'PY'
+import os
+from scapy.all import IP, UDP, ICMP, sendp, Ether
+mac = os.environ['SRGW_MAC']
+v4_dst = os.environ['V4_DST']
+gtpu = bytes.fromhex(
+ "34 ff 00 24 00 00 01 23 00 00 00 85"
+ "01 00 05 00")
+inner = bytes(IP(src='10.0.0.2', dst=v4_dst) / ICMP())
+pkt = (Ether(dst=mac) /
+ IP(src='10.0.0.2', dst=v4_dst) /
+ UDP(sport=2152, dport=2152) /
+ (gtpu + inner))
+sendp(pkt, iface='veth-n3', verbose=False)
+PY
+}
+
+# Send a GTPv1-U Echo Request; H.M.GTP4.D must NOT consume it but
+# pass it through to the configured forwarding path so the legacy UPF
+# (which owns the GTP-U control plane) can answer. Verified by
+# capturing the unaltered Echo Request (type 0x01) on the lupf side.
+send_gtpu_echo()
+{
+ local v4_dst="$1"
+ local srgw_mac
+
+ srgw_mac=$(ip -n "$srgw" -j link show veth-n3-srgw | \
+ python3 -c 'import sys, json; print(json.load(sys.stdin)[0]["address"])')
+
+ SRGW_MAC="$srgw_mac" V4_DST="$v4_dst" ip netns exec "$gnb" python3 - <<'PY'
+import os
+from scapy.all import IP, UDP, sendp, Ether
+mac = os.environ['SRGW_MAC']
+v4_dst = os.environ['V4_DST']
+gtpu_echo = bytes.fromhex("32 01 00 04 00 00 00 00 42 42 00 00")
+pkt = (Ether(dst=mac) /
+ IP(src='10.0.0.2', dst=v4_dst) /
+ UDP(sport=2152, dport=2152) /
+ gtpu_echo)
+sendp(pkt, iface='veth-n3', verbose=False)
+PY
+}
+
+run_echo_test()
+{
+ local v4_dst="$1"
+ local out
+ local rc
+
+ out=$(mktemp)
+
+ ip netns exec "$lupf" tcpdump -U -nni veth-n6-lupf -w "$out" \
+ 'udp port 2152' 2>/dev/null &
+ tcpdump_pid=$!
+ sleep 1
+
+ send_gtpu_echo "$v4_dst"
+
+ sleep 1
+ kill -INT "$tcpdump_pid" 2>/dev/null
+ wait "$tcpdump_pid" 2>/dev/null
+ tcpdump_pid=""
+
+ V4_DST="$v4_dst" python3 - "$out" <<'PYEOF'
+import os, sys
+from scapy.all import rdpcap, IP, UDP
+
+want_dst = os.environ['V4_DST']
+pkts = rdpcap(sys.argv[1])
+for p in pkts:
+ if IP not in p or UDP not in p:
+ continue
+ if p[UDP].sport != 2152 or p[UDP].dport != 2152:
+ continue
+ if p[IP].dst != want_dst:
+ continue
+ payload = bytes(p[UDP].payload)
+ if len(payload) >= 2 and payload[1] == 0x01:
+ sys.exit(0)
+sys.exit("no GTPv1-U Echo Request observed at lupf "
+ "(H.M.GTP4.D failed to pass non-T-PDU through)")
+PYEOF
+ rc=$?
+ rm -f "$out"
+ return $rc
+}
+
+capture_traffic()
+{
+ local capture_ns="$1"
+ local capture_iface="$2"
+ local v4_dst="$3"
+ local out="$4"
+
+ ip netns exec "$capture_ns" tcpdump -U -nni "$capture_iface" -w "$out" \
+ 'ip6' 2>/dev/null &
+ tcpdump_pid=$!
+ # Give tcpdump a brief moment to attach the BPF filter.
+ sleep 1
+
+ send_gtpu "$v4_dst"
+
+ sleep 1
+ kill -INT "$tcpdump_pid" 2>/dev/null
+ wait "$tcpdump_pid" 2>/dev/null
+ tcpdump_pid=""
+}
+
+run_test()
+{
+ local v4_dst="$1" # inner IPv4 DA fed into the gNB
+ local locator_octets="$2" # "20 01 0d b8"
+ local v4_dst_octets="$3" # "0a 63 00 02" (10.99.0.2) etc
+ local sa_pos="$4" # byte offset of expected IPv4 SA in IPv6 SA
+ local capture_ns="${5:-$srupf}"
+ local capture_iface="${6:-veth-n9-srupf}"
+ local out
+ local rc
+
+ out=$(mktemp)
+ capture_traffic "$capture_ns" "$capture_iface" "$v4_dst" "$out"
+
+ # scapy field check: an IPv6 packet must reach upf with:
+ # - DST address whose bytes 0..3 = locator, bytes 4..7 = original
+ # IPv4 DA, bytes 8..12 = 40-bit Args.Mob.Session
+ # (0x14 = QFI=5, then TEID 0x00000123), bytes 13..15 = padding.
+ # - SRC address whose bytes [sa_pos..sa_pos+4) = original IPv4 SA
+ # (10.0.0.2) per RFC 9433 Section 6.6 Figure 10.
+ LOC="$locator_octets" V4="$v4_dst_octets" SA_POS="$sa_pos" \
+ python3 - "$out" <<'PYEOF'
+import ipaddress
+import os
+import sys
+from scapy.all import rdpcap, IPv6
+
+loc = bytes.fromhex(os.environ['LOC'])
+v4_dst = bytes.fromhex(os.environ['V4'])
+sa_pos = int(os.environ['SA_POS'])
+expected_v4_sa = bytes.fromhex('0a 00 00 02')
+
+pkts = rdpcap(sys.argv[1])
+for p in pkts:
+ if IPv6 not in p:
+ continue
+ da = ipaddress.IPv6Address(str(p[IPv6].dst)).packed
+ sa = ipaddress.IPv6Address(str(p[IPv6].src)).packed
+ if da[0:4] != loc:
+ continue
+ if da[4:8] != v4_dst:
+ sys.exit(f"unexpected SID v4-DA slice {da[4:8].hex()}, want {v4_dst.hex()}")
+ if da[8:13] != bytes.fromhex("1400000123"):
+ sys.exit(f"unexpected Args.Mob.Session {da[8:13].hex()}")
+ if sa[sa_pos:sa_pos + 4] != expected_v4_sa:
+ sys.exit(f"unexpected IPv4 SA at byte {sa_pos}: "
+ f"{sa[sa_pos:sa_pos + 4].hex()}, want {expected_v4_sa.hex()}")
+ sys.exit(0)
+sys.exit("no IPv6 packet matching the expected SID locator")
+PYEOF
+ rc=$?
+ rm -f "$out"
+ return $rc
+}
+
+# Verify that nf_hooks_lwtunnel=1 makes the inner T-PDU 5-tuple
+# visible to nftables on the SR Gateway. The inner is IPv4
+# (10.0.0.2 -> v4_dst, set by send_gtpu()); the nft rule matches on
+# the inner IPv4 source. DROP must suppress the SRv6 packet at the
+# upf, ACCEPT must let it through.
+run_nf_test()
+{
+ local verdict="$1" # drop | accept
+ local expect="$2" # 1 if SRv6 expected, empty otherwise
+ local v4_dst="10.99.0.2"
+ local out
+
+ ip netns exec "$srgw" nft flush chain ip filter prerouting
+ ip netns exec "$srgw" nft add rule ip filter prerouting \
+ ip saddr 10.0.0.2 "$verdict"
+
+ out=$(mktemp)
+ capture_traffic "$srupf" "veth-n9-srupf" "$v4_dst" "$out"
+
+ if [ -n "$expect" ]; then
+ python3 - "$out" <<'PYEOF'
+import sys
+from scapy.all import rdpcap, IPv6
+
+pkts = rdpcap(sys.argv[1])
+for p in pkts:
+ if IPv6 in p:
+ sys.exit(0)
+sys.exit("expected SRv6 packet not observed at upf despite nft accept")
+PYEOF
+ else
+ python3 - "$out" <<'PYEOF'
+import sys
+from scapy.all import rdpcap, IPv6
+
+pkts = rdpcap(sys.argv[1])
+for p in pkts:
+    if IPv6 in p and bytes(p[IPv6])[6] in (0x04, 0x29):
+        # nexthdr 4 (IPIP) or 41 (IPv6): SRv6 encap of the inner packet, no SRH
+        sys.exit("SRv6 packet leaked to upf despite nft drop on inner")
+    if IPv6 in p and bytes(p[IPv6])[6] == 0x2b:
+        # nexthdr == 43 (Routing) means SRH present
+        sys.exit("SRv6 packet leaked to upf despite nft drop on inner")
+sys.exit(0)
+PYEOF
+ fi
+ local rc=$?
+ rm -f "$out"
+ return $rc
+}
+
+main()
+{
+ local rc=0
+
+ check_dependencies
+ setup
+
+ # Hard-coded /64 layout: IPv4 SA at IPv6 bytes 8..11.
+ if run_test "10.99.0.2" "20 01 0d b8" "0a 63 00 02" 8; then
+ echo "TEST: H.M.GTP4.D (default) [PASS]"
+ else
+ echo "TEST: H.M.GTP4.D (default) [FAIL]"
+ rc=1
+ fi
+
+ if run_echo_test "10.99.0.2"; then
+ echo "TEST: H.M.GTP4.D (non-T-PDU passthrough) [PASS]"
+ else
+ echo "TEST: H.M.GTP4.D (non-T-PDU passthrough) [FAIL]"
+ rc=1
+ fi
+
+ # VRF binding: SRv6 underlay output goes through vrf-n9 (table 100).
+ # Reported as [SKIP] when VRF support (CONFIG_NET_VRF) is unavailable.
+ if [ "$have_vrf" = "1" ]; then
+ # Locator 2001:db9::/32 -> "20 01 0d b9", v4 dst 10.99.1.2 ->
+ # "0a 63 01 02".
+ if run_test "10.99.1.2" "20 01 0d b9" "0a 63 01 02" 8 \
+ "$srupf_vrf" "veth-n9-2-srupf"; then
+ echo "TEST: H.M.GTP4.D (oif vrf-n9) [PASS]"
+ else
+ echo "TEST: H.M.GTP4.D (oif vrf-n9) [FAIL]"
+ rc=1
+ fi
+ else
+ echo "TEST: H.M.GTP4.D (oif vrf-n9) [SKIP] (CONFIG_NET_VRF not loaded)"
+ fi
+
+ # Inner T-PDU netfilter hook: only meaningful when nft is present
+ # and the kernel exposes net.netfilter.nf_hooks_lwtunnel.
+ if command -v nft >/dev/null && \
+ ip netns exec "$srgw" sysctl -wq \
+ net.netfilter.nf_hooks_lwtunnel=1 2>/dev/null; then
+ ip netns exec "$srgw" nft add table ip filter
+ ip netns exec "$srgw" nft 'add chain ip filter prerouting' \
+ '{ type filter hook prerouting priority 0; }'
+
+ if run_nf_test drop ""; then
+ echo "TEST: H.M.GTP4.D (nft drop on inner) [PASS]"
+ else
+ echo "TEST: H.M.GTP4.D (nft drop on inner) [FAIL]"
+ rc=1
+ fi
+
+ if run_nf_test accept "1"; then
+ echo "TEST: H.M.GTP4.D (nft accept on inner) [PASS]"
+ else
+ echo "TEST: H.M.GTP4.D (nft accept on inner) [FAIL]"
+ rc=1
+ fi
+ else
+ echo "TEST: H.M.GTP4.D (inner-flow netfilter hook) [SKIP]" \
+ "(nft or nf_hooks_lwtunnel unavailable)"
+ fi
+
+ if [ "$rc" -eq 0 ]; then
+ echo "TEST: H.M.GTP4.D [PASS]"
+ exit "$ksft_pass"
+ else
+ echo "TEST: H.M.GTP4.D [FAIL]"
+ exit "$ksft_fail"
+ fi
+}
+
+main "$@"
--
2.50.1
* [PATCH v2 7/7] Documentation: networking: add seg6_mobile guide
2026-05-04 16:30 [PATCH v2 0/7] seg6: add SRv6 Mobile User Plane (RFC 9433) behaviors Yuya Kusakabe
` (5 preceding siblings ...)
2026-05-04 16:30 ` [PATCH v2 6/7] seg6: add H.M.GTP4.D behavior Yuya Kusakabe
@ 2026-05-04 16:30 ` Yuya Kusakabe
2026-05-04 23:39 ` [PATCH v2 0/7] seg6: add SRv6 Mobile User Plane (RFC 9433) behaviors Jakub Kicinski
7 siblings, 0 replies; 11+ messages in thread
From: Yuya Kusakabe @ 2026-05-04 16:30 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Andrea Mayer, Shuah Khan, Jonathan Corbet,
Shuah Khan
Cc: linux-kernel, netdev, linux-kselftest, linux-doc, Yuya Kusakabe
Document the six RFC 9433 Mobile User Plane behaviors implemented
by seg6_local, the SID layout used by the GTP behaviors, security
considerations (HMAC, SR-domain perimeter filtering), netfilter
integration with nf_hooks_lwtunnel, and the location of the
selftests.
Link: https://www.rfc-editor.org/rfc/rfc9433
Signed-off-by: Yuya Kusakabe <yuya.kusakabe@gmail.com>
---
Documentation/networking/index.rst | 1 +
Documentation/networking/seg6_mobile.rst | 236 +++++++++++++++++++++++++++++++
2 files changed, 237 insertions(+)
diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index 44a422ad3b05..90fa0ad223da 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -108,6 +108,7 @@ Contents:
sctp
secid
seg6-sysctl
+ seg6_mobile
skbuff
smc-sysctl
sriov
diff --git a/Documentation/networking/seg6_mobile.rst b/Documentation/networking/seg6_mobile.rst
new file mode 100644
index 000000000000..6a268bedf3be
--- /dev/null
+++ b/Documentation/networking/seg6_mobile.rst
@@ -0,0 +1,236 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=================================
+SRv6 Mobile User Plane (RFC 9433)
+=================================
+
+This document describes the SRv6 Mobile User Plane (MUP) behaviors
+implemented by the ``seg6_local`` lightweight tunnel. Six of the
+seven behaviors defined in `RFC 9433`_ are supported and configurable
+through ``ip route ... encap seg6local action ...``: End.MAP,
+End.M.GTP6.D, End.M.GTP6.D.Di, End.M.GTP6.E, End.M.GTP4.E, and
+H.M.GTP4.D.
+
+End.Limit (RFC 9433 Section 6.8) is unimplemented.
+
+.. _`RFC 9433`: https://www.rfc-editor.org/rfc/rfc9433
+
+Behaviors
+=========
+
+End.MAP (`RFC 9433`_ Section 6.2)
+---------------------------------
+
+Endpoint with SID mapping. Replaces the IPv6 destination address with
+the next SID; the SRH is left untouched. Standard SRv6 endpoint hop
+limit handling applies (an ICMP Time Exceeded is emitted when the IPv6
+Hop Limit would reach zero per RFC 9433 Section 6.2 S01-S03; the Hop
+Limit is decremented per S04 before forwarding). ``nh6`` selects the
+replacement SID::
+
+ ip -6 route add 2001:db8:f::/64 \
+ encap seg6local action End.MAP nh6 2001:db8:2::e \
+ dev <dev>
+
+End.M.GTP6.D (`RFC 9433`_ Section 6.3)
+--------------------------------------
+
+Encapsulation endpoint that consumes IPv6/UDP/GTP-U and emits SRv6.
+The new SRH is built from the configured segment list, with the
+original outer IPv6 destination ``D`` of the inbound GTP-U packet
+stamped at SRH ``segments[0]`` (the ultimate destination of the SR
+Policy). The last entry of the configured ``srh segs`` is the remote
+End.M.GTP6.E SID; it lands at SRH ``segments[1]``, the penultimate
+position required by the Note in RFC 9433 Section 6.5. The kernel
+encodes ``Args.Mob.Session`` into that SID immediately after its
+locator, and the egress End.M.GTP6.E peer recovers ``D`` (and hence
+the original gNB-side GTP-U destination) from ``segments[0]`` after
+stripping the SRv6 encapsulation::
+
+ ip -6 route add 2001:db8:f::/64 \
+ encap seg6local action End.M.GTP6.D \
+ srh segs 2001:db8:2::e \
+ src 2001:db8:2::1 \
+ sr_prefix_len 64 \
+ dev <dev>
+
+``sr_prefix_len`` declares the locator length used by the remote
+End.M.GTP6.E SID and must match the prefix length configured on that
+remote endpoint; the SR Gateway has no way to discover the remote
+SID's locator length on its own.
+
+The wire SRH for the example above is
+``[D, 2001:db8:2::e | Args.Mob.Session]``: ``segments[0]`` is the
+saved original outer IPv6 DA of the GTP-U packet, and
+``segments[1]`` is the End.M.GTP6.E SID at the egress UPF in the
+penultimate position required by RFC 9433 Section 6.5.
+
+End.M.GTP6.D.Di (`RFC 9433`_ Section 6.4)
+-----------------------------------------
+
+Drop-in variant of End.M.GTP6.D that preserves the original IPv6
+destination address as ``segments[0]`` (the last in-transit SID in the
+new SRH) and discards TEID/QFI rather than folding them into
+Args.Mob.Session. Useful when the upstream service expects the
+original destination to survive untouched::
+
+ ip -6 route add 2001:db8:f::/64 \
+ encap seg6local action End.M.GTP6.D.Di \
+ srh segs 2001:db8:2::e,2001:db8:3::e \
+ src 2001:db8:2::1 \
+ dev <dev>
+
+End.M.GTP6.E (`RFC 9433`_ Section 6.5)
+--------------------------------------
+
+Egress endpoint that decapsulates SRv6 and emits IPv6/UDP/GTP-U. The
+active SID carries the 40-bit ``Args.Mob.Session`` field defined in
+RFC 9433 Section 6.1 immediately after the locator; TEID and QFI are
+extracted from it. The route prefix length implicitly declares the
+locator length on this end of the tunnel; no explicit
+``sr_prefix_len`` is required because the SID is locally instantiated
+by this route::
+
+ ip -6 route add 2001:db8:e::/64 \
+ encap seg6local action End.M.GTP6.E src 2001:db8:2::1 \
+ dev <dev>
+
+The route prefix length must leave room for the 40-bit
+``Args.Mob.Session`` that immediately follows the locator, so the
+constraint ``prefix_len + 40 <= 128`` (i.e. ``prefix_len <= 88``)
+is enforced at install time.
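For readers who want to see the 40-bit field concretely, here is a minimal Python sketch of the Args.Mob.Session layout from RFC 9433 Section 6.1 (QFI in 6 bits, R and U in 1 bit each, then a 32-bit PDU Session ID carrying the TEID). It is an illustration, not the kernel implementation. The helper names and the example SID are hypothetical, and the sketch assumes the field may start at any bit offset after the locator.

```python
import ipaddress

ARGS_MOB_LEN = 40  # bits


def pack_args_mob_session(teid, qfi, r=0, u=0):
    """Return the 5-byte Args.Mob.Session for the given TEID and QFI."""
    if not 0 <= teid < 2**32:
        raise ValueError("TEID must fit in 32 bits")
    if not 0 <= qfi < 64:
        raise ValueError("QFI must fit in 6 bits")
    first = (qfi << 2) | ((r & 1) << 1) | (u & 1)
    return bytes([first]) + teid.to_bytes(4, "big")


def unpack_args_mob_session(args):
    """Return (teid, qfi, r, u) from a 5-byte Args.Mob.Session."""
    if len(args) != 5:
        raise ValueError("Args.Mob.Session is exactly 40 bits")
    first = args[0]
    return int.from_bytes(args[1:], "big"), first >> 2, (first >> 1) & 1, first & 1


def args_from_sid(sid, prefix_len):
    """Extract the 40 bits that immediately follow a prefix_len-bit locator."""
    if prefix_len + ARGS_MOB_LEN > 128:
        raise ValueError("prefix_len must be <= 88")
    value = int(ipaddress.IPv6Address(sid))
    shift = 128 - prefix_len - ARGS_MOB_LEN
    return ((value >> shift) & ((1 << ARGS_MOB_LEN) - 1)).to_bytes(5, "big")


args = args_from_sid("2001:db8:e:0:1400:1:2300:0", 64)
print(args.hex(), unpack_args_mob_session(args))
```

With a /64 route the field occupies bytes 8 to 12 of the SID. The ``ValueError`` for longer prefixes corresponds to the ``prefix_len <= 88`` rule stated above.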
+
+The optional ``pdu_type {dl|ul|<num>}`` attribute supplies the PDU
+Type field (3GPP TS 38.415 Section 5.5.2) of the GTP-U PDU Session
+Container. When set, every emitted packet carries the container with
+that PDU Type and the QFI from ``Args.Mob.Session``; when unset the
+kernel emits a short GTPv1-U header with no container, regardless of
+the QFI. ``pdu_type`` MUST be set on routes serving 5G N3 traffic;
+omitting it targets LTE-only / S1-U deployments. Numeric ``<num>``
+in 0..15 is also accepted (per TS 38.415 the field is 4 bits wide;
+2..15 are currently reserved).
+
+End.M.GTP4.E (`RFC 9433`_ Section 6.6)
+--------------------------------------
+
+Egress endpoint that decapsulates SRv6 and emits IPv4/UDP/GTP-U.
+The SID encodes the IPv4 destination per RFC 9433 Section 6.6
+Figure 9. ``v4_mask_len`` declares the width of the IPv4 DA slice
+that immediately follows the locator (1..32; the constraint
+``locator + v4_mask_len + 40 <= 128`` is enforced at install time)::
+
+ ip -6 route add 2001:db8::/32 \
+ encap seg6local action End.M.GTP4.E \
+ src 2001:db8:2::1 v4_mask_len 32 \
+ dev <dev>
+
+The IPv6 source address carries the IPv4 SA per RFC 9433 Section 6.6
+Figure 10. ``v6_src_prefix_len`` declares the Source UPF Prefix
+length P in bits (1..127, default 64); the IPv4 SA slice is then
+``v4_mask_len`` bits wide starting at bit offset P. The kernel
+always reads a 32-bit window from the configured ``src`` template at
+offset P (the upper ``v4_mask_len`` bits are overlaid with the
+recovered IPv4 SA), so the constraint ``v6_src_prefix_len <= 96``
+(equivalently ``v6_src_prefix_len + 32 <= 128``) is enforced at
+install time. Bits outside the IPv4 SA slice are taken verbatim
+from the configured ``src`` template::
+
+ ip -6 route add 2001:db8::/32 \
+ encap seg6local action End.M.GTP4.E \
+ src 2001:db8:2::1 v4_mask_len 32 v6_src_prefix_len 64 \
+ dev <dev>
+
+``pdu_type`` takes the same values and has the same effect as on
+End.M.GTP6.E (see above).
+
+H.M.GTP4.D (`RFC 9433`_ Section 6.7)
+------------------------------------
+
+Headend behavior that consumes IPv4/UDP/GTP-U and emits IPv6 with
+the SID encoding the original IPv4 destination plus Args.Mob.Session
+per RFC 9433 Section 6.7 Figure 11::
+
+ ip -4 route add 10.99.0.0/24 \
+ encap seg6local action H.M.GTP4.D \
+ nh6 2001:db8:: \
+ src 2001:db8:2::1 \
+ v4_mask_len 32 sr_prefix_len 32 \
+ dev <dev>
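The SID that this route produces can be reproduced offline. The following Python sketch assembles locator, IPv4 DA slice and Args.Mob.Session in that order and zero-pads the rest. It is an illustration under stated assumptions, not the kernel code: ``build_gtp4_sid`` is a hypothetical helper, R and U are taken as 0, and for ``v4_mask_len`` below 32 it assumes the most significant bits of the IPv4 DA are kept.

```python
import ipaddress

ARGS_MOB_LEN = 40  # bits


def build_gtp4_sid(locator, sr_prefix_len, v4_da, v4_mask_len, teid, qfi):
    """Compose locator | IPv4 DA | Args.Mob.Session | zero padding."""
    if not 1 <= v4_mask_len <= 32:
        raise ValueError("v4_mask_len must be in 1..32")
    if sr_prefix_len + v4_mask_len + ARGS_MOB_LEN > 128:
        raise ValueError("sr_prefix_len + v4_mask_len + 40 must not exceed 128")

    # Keep only the locator bits of the nh6 template.
    loc_mask = ((1 << sr_prefix_len) - 1) << (128 - sr_prefix_len)
    sid = int(ipaddress.IPv6Address(locator)) & loc_mask

    # IPv4 DA slice directly after the locator.
    v4 = int(ipaddress.IPv4Address(v4_da)) >> (32 - v4_mask_len)
    sid |= v4 << (128 - sr_prefix_len - v4_mask_len)

    # Args.Mob.Session: QFI in the top 6 bits of the first byte, then TEID.
    args = ((qfi << 2) << 32) | teid
    sid |= args << (128 - sr_prefix_len - v4_mask_len - ARGS_MOB_LEN)
    return ipaddress.IPv6Address(sid)


# Values used by the selftest: TEID 0x123, QFI 5, GTP-U destination 10.99.0.2.
print(build_gtp4_sid("2001:db8::", 32, "10.99.0.2", 32, 0x123, 5))
```

For the selftest values this yields 2001:db8:a63:2:1400:1:2300:0, the destination the selftest expects on the SR side.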
+
+The inbound IPv4 SA is encoded into the IPv6 SA using the same
+Figure 10 layout as End.M.GTP4.E (controlled by ``v6_src_prefix_len``,
+default 64).
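The source-address encoding can be sketched the same way. This is a minimal Python illustration, not the kernel code: it overlays the IPv4 SA onto the configured ``src`` template at bit offset ``v6_src_prefix_len`` and leaves all other template bits untouched. ``overlay_v4_sa`` is a hypothetical helper, and keeping the most significant IPv4 bits when ``v4_mask_len`` is below 32 is an assumption.

```python
import ipaddress


def overlay_v4_sa(src_template, v6_src_prefix_len, v4_sa, v4_mask_len=32):
    """Overlay v4_mask_len bits of the IPv4 SA at bit offset v6_src_prefix_len."""
    if not 1 <= v4_mask_len <= 32:
        raise ValueError("v4_mask_len must be in 1..32")
    if v6_src_prefix_len + 32 > 128:
        raise ValueError("v6_src_prefix_len must be <= 96")

    template = int(ipaddress.IPv6Address(src_template))
    shift = 128 - v6_src_prefix_len - v4_mask_len
    slice_mask = ((1 << v4_mask_len) - 1) << shift
    v4 = int(ipaddress.IPv4Address(v4_sa)) >> (32 - v4_mask_len)

    # Clear the slice in the template, then drop the IPv4 bits in.
    return ipaddress.IPv6Address((template & ~slice_mask) | (v4 << shift))


sa = overlay_v4_sa("2001:db8:2::1", 64, "10.0.0.2")
print(sa, sa.packed[8:12].hex())
```

With the default prefix length of 64 the IPv4 SA lands in bytes 8 to 11 of the IPv6 SA, which is the offset the selftest checks.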
+
+Per-route VRF / interface binding
+=================================
+
+The five GTP-related behaviors (End.M.GTP4.E, End.M.GTP6.E,
+End.M.GTP6.D, End.M.GTP6.D.Di and H.M.GTP4.D) accept the standard
+``oif`` ``seg6_local`` attribute to bind their egress lookup to a
+specific output interface or VRF device. This lets operators keep
+the SRv6 underlay, the N3 reference point (toward gNB) and the N6
+reference point (toward the data network) on separate routing tables
+or VLAN sub-interfaces, which matches typical multi-tenant
+deployments::
+
+ ip -6 route add 2001:db8:e::/64 \
+ encap seg6local action End.M.GTP6.E \
+ src 2001:db8:2::1 \
+ oif vrf-n3 \
+ dev <dev>
+
+Without ``oif`` the egress lookup uses the default routing table.
+
+Netfilter integration
+=====================
+
+The five GTP-related behaviors expose IPv4 / IPv6 inner T-PDUs to
+``NF_INET_PRE_ROUTING`` between the outer strip and the new outer
+push (mirroring ``End.DX4`` / ``End.DX6``), so iptables / nftables /
+conntrack can apply policy on the inner 5-tuple before
+re-encapsulation. Non-IP inner payloads bypass the hook and are
+re-encapsulated unchanged.
+
+Enable with::
+
+ sysctl -w net.netfilter.nf_hooks_lwtunnel=1
+
+This sysctl is one-way: once set, it cannot be cleared until the
+system is rebooted (see :doc:`nf_conntrack-sysctl`).
+
+Example: drop traffic from a UE address on an ``End.M.GTP6.D`` SR
+Gateway::
+
+ nft add table ip filter
+ nft 'add chain ip filter prerouting \
+ { type filter hook prerouting priority 0; }'
+ nft add rule ip filter prerouting \
+ ip saddr 10.0.0.42 counter drop
+
+GTP-U non-T-PDU handling
+========================
+
+The three GTP-U decap behaviors (``End.M.GTP6.D``,
+``End.M.GTP6.D.Di`` and ``H.M.GTP4.D``) only encapsulate ``T-PDU``
+frames (3GPP TS 29.281 Section 5.1, message type 255) into SRv6.
+Any other GTP-U message (``Echo Request``/``Echo Response``,
+``Error Indication``, ``Supported Extension Headers Notification``,
+...) is forwarded unchanged toward its IPv4 / IPv6 destination as
+a regular packet, so a downstream UPF that owns the GTP-U control
+plane can process it.
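
The T-PDU versus non-T-PDU split described above amounts to a check on the GTPv1-U header. The sketch below shows that check in userspace Python for illustration only; the function name is hypothetical and the kernel implementation in this series may validate more (or different) fields.

```python
import struct

GTPU_MSG_TPDU = 255        # message type carrying a T-PDU
GTPU_MIN_HEADER_LEN = 8    # flags, type, length, TEID


def gtpu_is_tpdu(udp_payload: bytes) -> bool:
    """Return True if the UDP payload starts with a GTPv1-U T-PDU header."""
    if len(udp_payload) < GTPU_MIN_HEADER_LEN:
        return False
    # Network byte order: flags(1), message type(1), length(2), TEID(4).
    flags, msg_type, _length, _teid = struct.unpack(
        "!BBHI", udp_payload[:GTPU_MIN_HEADER_LEN])
    version = flags >> 5            # top three bits
    protocol_type = (flags >> 4) & 1  # 1 = GTP, 0 = GTP'
    return version == 1 and protocol_type == 1 and msg_type == GTPU_MSG_TPDU


if __name__ == "__main__":
    tpdu = bytes([0x30, 0xFF, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01])
    echo_request = bytes([0x32, 0x01, 0x00, 0x04, 0, 0, 0, 0, 0, 0, 0, 0])
    print(gtpu_is_tpdu(tpdu), gtpu_is_tpdu(echo_request))
```

In this sketch an Echo Request (message type 1) fails the check, which corresponds to the case the text describes as being forwarded unchanged toward its destination.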
+
+The downstream UPF must therefore live behind the SR Gateway, on
+the IPv4 / IPv6 network the GTP-U destination is reachable through.
+Assigning the UPF's address to a local interface on the SR Gateway
+is not a supported topology: the SR Gateway cannot at the same
+time intercept ``T-PDU`` for SRv6 encapsulation and deliver
+non-T-PDU GTP-U to a local userspace socket addressed to the same
+destination.
+
+Debugging
+=========
+
+The SRv6 Mobile data path drops malformed or out-of-policy packets via
+the standard skb drop-reason infrastructure (``SEG6_MOBILE_*`` in
+``include/net/dropreason-core.h``) rather than silently returning
+``-EINVAL``, so the cause is observable via the ``skb:kfree_skb`` tracepoint.
--
2.50.1
* Re: [PATCH v2 0/7] seg6: add SRv6 Mobile User Plane (RFC 9433) behaviors
2026-05-04 16:30 [PATCH v2 0/7] seg6: add SRv6 Mobile User Plane (RFC 9433) behaviors Yuya Kusakabe
` (6 preceding siblings ...)
2026-05-04 16:30 ` [PATCH v2 7/7] Documentation: networking: add seg6_mobile guide Yuya Kusakabe
@ 2026-05-04 23:39 ` Jakub Kicinski
2026-05-05 1:22 ` Yuya Kusakabe
7 siblings, 1 reply; 11+ messages in thread
From: Jakub Kicinski @ 2026-05-04 23:39 UTC (permalink / raw)
To: Yuya Kusakabe
Cc: David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
Andrea Mayer, Shuah Khan, Jonathan Corbet, Shuah Khan,
linux-kernel, netdev, linux-kselftest, linux-doc
On Tue, 05 May 2026 01:30:10 +0900 Yuya Kusakabe wrote:
> This series adds the in-kernel data path for the SRv6 Mobile User
> Plane (MUP) architecture defined in RFC 9433. SRv6 MUP integrates
> GTP-U mobile traffic into an SRv6 transport domain by mapping the
> 5-tuple (TEID, QFI, R, U, PDU Session ID) into a single SID, allowing
> operators to replace the GTP-U overlay between the gNB and the
> upstream UPF with native SRv6 forwarding while keeping the radio side
> unchanged.
Could you switch to posting this as an RFC until you gather some review
tags? Our CI requires manual intervention to add the necessary iproute2
patches, and I suspect there are some uAPI changes here that require
iproute2 changes.
* Re: [PATCH v2 0/7] seg6: add SRv6 Mobile User Plane (RFC 9433) behaviors
2026-05-04 23:39 ` [PATCH v2 0/7] seg6: add SRv6 Mobile User Plane (RFC 9433) behaviors Jakub Kicinski
@ 2026-05-05 1:22 ` Yuya Kusakabe
2026-05-05 1:28 ` Jakub Kicinski
0 siblings, 1 reply; 11+ messages in thread
From: Yuya Kusakabe @ 2026-05-05 1:22 UTC (permalink / raw)
To: Jakub Kicinski
Cc: David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
Andrea Mayer, Shuah Khan, Jonathan Corbet, Shuah Khan,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
linux-kselftest@vger.kernel.org, linux-doc@vger.kernel.org
On Tuesday, May 5, 2026, Jakub Kicinski <kuba@kernel.org> wrote:
> Could you switch to posting this as an RFC until you gather some review
> tags? Our CI requires manual intervention to add the necessary iproute2
> patches, and I suspect there are some uAPI changes here that require
> iproute2 changes.
Will do. Yes, this series adds new SEG6_LOCAL_* / SEG6_LOCAL_MOBILE_*
uAPI in include/uapi/linux/seg6_local.h; the matching iproute2-next
series is posted separately:
https://lore.kernel.org/netdev/20260505-seg6-mobile-v2-0-93291b7b0134@gmail.com/
Just to confirm the workflow you'd prefer: should I repost the
current series immediately as [PATCH RFC net-next v3 ...], or wait
for technical review on v2 to land and fold it into a v3 RFC?
Thanks,
Yuya
* Re: [PATCH v2 0/7] seg6: add SRv6 Mobile User Plane (RFC 9433) behaviors
2026-05-05 1:22 ` Yuya Kusakabe
@ 2026-05-05 1:28 ` Jakub Kicinski
0 siblings, 0 replies; 11+ messages in thread
From: Jakub Kicinski @ 2026-05-05 1:28 UTC (permalink / raw)
To: Yuya Kusakabe
Cc: David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
Andrea Mayer, Shuah Khan, Jonathan Corbet, Shuah Khan,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
linux-kselftest@vger.kernel.org, linux-doc@vger.kernel.org,
Justin Iurman
On Tue, 5 May 2026 10:22:58 +0900 Yuya Kusakabe wrote:
> Just to confirm the workflow you'd prefer: should I repost the
> current series immediately as [PATCH RFC net-next v3 ...], or wait
> for technical review on v2 to land and fold it into a v3 RFC?
Let's wait for reviews (adding Justin to CC as well FWIW)