* [PATCH net-next v5 0/7] ipv6: Address ext hdr DoS vulnerabilities
@ 2026-01-26 19:48 Tom Herbert
2026-01-26 19:48 ` [PATCH net-next v5 1/7] ipv6: Check of max HBH or DestOp sysctl is zero and drop if it is Tom Herbert
` (6 more replies)
0 siblings, 7 replies; 30+ messages in thread
From: Tom Herbert @ 2026-01-26 19:48 UTC (permalink / raw)
To: davem, kuba, netdev, justin.iurman; +Cc: Tom Herbert
IPv6 extension headers are defined to be quite open ended with few
limits. For instance, RFC8200 requires a receiver to process any
number of extension headers in a packet in any order. This flexibility
comes at the cost of a potential Denial of Service attack. The only
thing that might mitigate the DoS attacks is the fact that packets
with extension headers experience high drop rates on the Internet so
that a DoS attack based on extension headers wouldn't be very effective
at Internet scale.
This patch set addresses some of the more egregious vulnerabilities
of extension headers to DoS attack.
- If sysctl.max_dst_opts_cnt or sysctl.max_hbh_opts_cnt is set to 0
then packets with Destination Options or Hop-by-Hop Options,
respectively, are dropped even if the packet contains zero non-padding
options
- Add a case for IPV6_TLV_TNL_ENCAP_LIMIT in the switch on TLV type
in ip6_parse_tlv function. This TLV is handled in tunnel processing,
however it needs to be detected in ip6_parse_tlv to properly account
for it as recognized non-padding option
- Move IPV6_TLV_TNL_ENCAP_LIMIT to uapi/linux/in6.h so that all the
TLV definitions are in one place
- Set the default limits of non-padding Hop-by-Hop and Destination
options to 2. This means that if a packet contains more than two
non-padding options then it will be dropped. The previous limit
was 8, but that was too liberal considering that the stack only
supports two Destination Options and the most Hop-by-Hop options
likely to ever be in the same packet are IOAM and JUMBO. The limit
can be increased via sysctl for private use and experimentation
- Enforce RFC8200 recommended ordering of Extension Headers. This
also enforces that any Extension Header occurs at most once
in a packet except for Destination Options that may appear
twice. The enforce_ext_hdr_order sysctl controls enforcement. If
it's set to true then order is enforced, if it's set to false then
neither order nor number of occurrences are enforced.
The enforced ordering is:
IPv6 header
Hop-by-Hop Options header
Destination Options before the Routing header
Routing header
Fragment header
Authentication header
Encapsulating Security Payload header
Destination Options header
Upper-Layer header
V4: Switch order of patches to avoid transient build failure
V5: Allow Destination Options before the Routing header, fixes
suggested by Justin Iurman
Tom Herbert (7):
ipv6: Check of max HBH or DestOp sysctl is zero and drop if it is
ipv6: Cleanup IPv6 TLV definitions
ipv6: Add case for IPV6_TLV_TNL_ENCAP_LIMIT in EH TLV switch
ipv6: Set HBH and DestOpt limits to 2
ipv6: Document defaults for max_{dst|hbh}_opts_number sysctls
ipv6: Enforce Extension Header ordering
ipv6: Document enforce_ext_hdr_order sysctl
Documentation/networking/ip-sysctl.rst | 50 +++++++++++++++++++++-----
include/net/ipv6.h | 9 +++--
include/net/netns/ipv6.h | 1 +
include/net/protocol.h | 14 ++++++++
include/uapi/linux/in6.h | 21 +++++++----
include/uapi/linux/ip6_tunnel.h | 1 -
net/ipv6/af_inet6.c | 1 +
net/ipv6/exthdrs.c | 32 ++++++++++++-----
net/ipv6/ip6_input.c | 42 ++++++++++++++++++++++
net/ipv6/reassembly.c | 1 +
net/ipv6/sysctl_net_ipv6.c | 7 ++++
net/ipv6/xfrm6_protocol.c | 2 ++
12 files changed, 153 insertions(+), 28 deletions(-)
--
2.43.0
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH net-next v5 1/7] ipv6: Check of max HBH or DestOp sysctl is zero and drop if it is
2026-01-26 19:48 [PATCH net-next v5 0/7] ipv6: Address ext hdr DoS vulnerabilities Tom Herbert
@ 2026-01-26 19:48 ` Tom Herbert
2026-01-27 17:49 ` Justin Iurman
2026-01-27 17:50 ` Justin Iurman
2026-01-26 19:48 ` [PATCH net-next v5 2/7] ipv6: Cleanup IPv6 TLV definitions Tom Herbert
` (5 subsequent siblings)
6 siblings, 2 replies; 30+ messages in thread
From: Tom Herbert @ 2026-01-26 19:48 UTC (permalink / raw)
To: davem, kuba, netdev, justin.iurman; +Cc: Tom Herbert
In IPv6 Destination options processing function check if
net->ipv6.sysctl.max_dst_opts_cnt is zero up front. If it is zero then
drop the packet since Destination Options processing is disabled.
Similarly, in IPv6 hop-by-hop options processing function check if
net->ipv6.sysctl.max_hbh_opts_cnt is zero up front. If it is zero then
drop the packet since Hop-by-Hop Options processing is disabled.
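As a rough userspace illustration (not kernel code), the way a single
limit value is interpreted can be sketched as below. The struct and
function names are hypothetical and exist only for this sketch; the
negative-value branch mirrors the existing handling of max_count in
ip6_parse_tlv, while the zero branch is the new up-front check.

```c
#include <stdbool.h>

/* Hypothetical result type for this sketch; the kernel uses its own
 * drop paths and statistics counters instead.
 */
struct opt_limit {
	bool drop_packet;       /* limit == 0: options processing disabled */
	bool disallow_unknowns; /* limit < 0: only known options accepted */
	int max_count;          /* max number of non-padding options */
};

/* Interpret a max_{dst|hbh}_opts_cnt style value. */
static struct opt_limit interpret_opts_limit(int sysctl_val)
{
	struct opt_limit l = { false, false, sysctl_val };

	if (sysctl_val == 0) {
		/* Any packet carrying this extension header is dropped */
		l.drop_packet = true;
	} else if (sysctl_val < 0) {
		/* Absolute value is the limit; unknown options are refused */
		l.disallow_unknowns = true;
		l.max_count = -sysctl_val;
	}
	return l;
}
```

With the proposed default of 2, interpret_opts_limit(2) would report a
limit of two options with unknown options still permitted.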
Signed-off-by: Tom Herbert <tom@herbertland.com>
---
net/ipv6/exthdrs.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
index 54088fa0c09d..b9d186784b96 100644
--- a/net/ipv6/exthdrs.c
+++ b/net/ipv6/exthdrs.c
@@ -301,9 +301,11 @@ static int ipv6_destopt_rcv(struct sk_buff *skb)
#endif
struct dst_entry *dst = skb_dst(skb);
struct net *net = dev_net(skb->dev);
- int extlen;
+ int extlen, max_opts_cnt;
- if (!pskb_may_pull(skb, skb_transport_offset(skb) + 8) ||
+ max_opts_cnt = READ_ONCE(net->ipv6.sysctl.max_dst_opts_cnt);
+ if (!max_opts_cnt ||
+ !pskb_may_pull(skb, skb_transport_offset(skb) + 8) ||
!pskb_may_pull(skb, (skb_transport_offset(skb) +
((skb_transport_header(skb)[1] + 1) << 3)))) {
__IP6_INC_STATS(dev_net(dst_dev(dst)), idev,
@@ -322,8 +324,7 @@ static int ipv6_destopt_rcv(struct sk_buff *skb)
dstbuf = opt->dst1;
#endif
- if (ip6_parse_tlv(false, skb,
- READ_ONCE(net->ipv6.sysctl.max_dst_opts_cnt))) {
+ if (ip6_parse_tlv(false, skb, max_opts_cnt)) {
skb->transport_header += extlen;
opt = IP6CB(skb);
#if IS_ENABLED(CONFIG_IPV6_MIP6)
@@ -1033,7 +1034,7 @@ int ipv6_parse_hopopts(struct sk_buff *skb)
{
struct inet6_skb_parm *opt = IP6CB(skb);
struct net *net = dev_net(skb->dev);
- int extlen;
+ int extlen, max_opts_cnt;
/*
* skb_network_header(skb) is equal to skb->data, and
@@ -1041,7 +1042,9 @@ int ipv6_parse_hopopts(struct sk_buff *skb)
* sizeof(struct ipv6hdr) by definition of
* hop-by-hop options.
*/
- if (!pskb_may_pull(skb, sizeof(struct ipv6hdr) + 8) ||
+ max_opts_cnt = READ_ONCE(net->ipv6.sysctl.max_hbh_opts_cnt);
+ if (!max_opts_cnt ||
+ !pskb_may_pull(skb, sizeof(struct ipv6hdr) + 8) ||
!pskb_may_pull(skb, (sizeof(struct ipv6hdr) +
((skb_transport_header(skb)[1] + 1) << 3)))) {
fail_and_free:
@@ -1054,8 +1057,7 @@ int ipv6_parse_hopopts(struct sk_buff *skb)
goto fail_and_free;
opt->flags |= IP6SKB_HOPBYHOP;
- if (ip6_parse_tlv(true, skb,
- READ_ONCE(net->ipv6.sysctl.max_hbh_opts_cnt))) {
+ if (ip6_parse_tlv(true, skb, max_opts_cnt)) {
skb->transport_header += extlen;
opt = IP6CB(skb);
opt->nhoff = sizeof(struct ipv6hdr);
--
2.43.0
* [PATCH net-next v5 2/7] ipv6: Cleanup IPv6 TLV definitions
2026-01-26 19:48 [PATCH net-next v5 0/7] ipv6: Address ext hdr DoS vulnerabilities Tom Herbert
2026-01-26 19:48 ` [PATCH net-next v5 1/7] ipv6: Check of max HBH or DestOp sysctl is zero and drop if it is Tom Herbert
@ 2026-01-26 19:48 ` Tom Herbert
2026-01-27 17:51 ` Justin Iurman
2026-01-29 5:30 ` Willem de Bruijn
2026-01-26 19:48 ` [PATCH net-next v5 3/7] ipv6: Add case for IPV6_TLV_TNL_ENCAP_LIMIT in EH TLV switch Tom Herbert
` (4 subsequent siblings)
6 siblings, 2 replies; 30+ messages in thread
From: Tom Herbert @ 2026-01-26 19:48 UTC (permalink / raw)
To: davem, kuba, netdev, justin.iurman; +Cc: Tom Herbert
Move IPV6_TLV_TNL_ENCAP_LIMIT to uapi/linux/in6.h to be with the rest
of the TLV definitions. Label each of the TLV definitions as to whether
they are a Hop-by-Hop option, Destination option, or both.
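A small lookup table is one way to picture the labelling. The sketch
below is illustrative only: the OPT_* flags, struct tlv_info and
tlv_allowed_in() are hypothetical names, and only the option numbers
and HBH/DestOpt labels are taken from the definitions in this patch.

```c
#include <stddef.h>

#define OPT_HBH     0x1 /* may appear in a Hop-by-Hop Options header */
#define OPT_DESTOPT 0x2 /* may appear in a Destination Options header */

struct tlv_info {
	unsigned char type;
	const char *name;
	unsigned int where; /* OPT_HBH, OPT_DESTOPT or both */
};

/* Option numbers and labels as listed in uapi/linux/in6.h by this patch */
static const struct tlv_info tlv_table[] = {
	{   0, "PAD1",            OPT_HBH | OPT_DESTOPT },
	{   1, "PADN",            OPT_HBH | OPT_DESTOPT },
	{   4, "TNL_ENCAP_LIMIT", OPT_DESTOPT },
	{   5, "ROUTERALERT",     OPT_HBH },
	{   7, "CALIPSO",         OPT_HBH },
	{  49, "IOAM",            OPT_HBH | OPT_DESTOPT },
	{ 194, "JUMBO",           OPT_HBH },
	{ 201, "HAO",             OPT_DESTOPT },
};

/* Return the headers an option type is labelled for, or 0 if the type
 * is not in the table.
 */
static unsigned int tlv_allowed_in(unsigned char type)
{
	for (size_t i = 0; i < sizeof(tlv_table) / sizeof(tlv_table[0]); i++)
		if (tlv_table[i].type == type)
			return tlv_table[i].where;
	return 0;
}
```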
Signed-off-by: Tom Herbert <tom@herbertland.com>
---
include/uapi/linux/in6.h | 21 ++++++++++++++-------
include/uapi/linux/ip6_tunnel.h | 1 -
2 files changed, 14 insertions(+), 8 deletions(-)
diff --git a/include/uapi/linux/in6.h b/include/uapi/linux/in6.h
index 5a47339ef7d7..438283dc5fde 100644
--- a/include/uapi/linux/in6.h
+++ b/include/uapi/linux/in6.h
@@ -140,14 +140,21 @@ struct in6_flowlabel_req {
/*
* IPv6 TLV options.
+ *
+ * Hop-by-Hop and Destination options share the same number space.
+ * For each option below whether it is a Hop-by-Hop option or
+ * a Destination option is indicated by HBH or DestOpt.
*/
-#define IPV6_TLV_PAD1 0
-#define IPV6_TLV_PADN 1
-#define IPV6_TLV_ROUTERALERT 5
-#define IPV6_TLV_CALIPSO 7 /* RFC 5570 */
-#define IPV6_TLV_IOAM 49 /* RFC 9486 */
-#define IPV6_TLV_JUMBO 194
-#define IPV6_TLV_HAO 201 /* home address option */
+#define IPV6_TLV_PAD1 0 /* HBH or DestOpt */
+#define IPV6_TLV_PADN 1 /* HBH or DestOpt */
+#define IPV6_TLV_TNL_ENCAP_LIMIT 4 /* RFC 2473, DestOpt */
+#define IPV6_TLV_ROUTERALERT 5 /* HBH */
+#define IPV6_TLV_CALIPSO 7 /* RFC 5570, HBH */
+#define IPV6_TLV_IOAM 49 /* RFC 9486, HBH or DestOpt
+ * IOAM sent and rcvd as HBH
+ */
+#define IPV6_TLV_JUMBO 194 /* HBH */
+#define IPV6_TLV_HAO 201 /* home address option, DestOpt */
/*
* IPV6 socket options
diff --git a/include/uapi/linux/ip6_tunnel.h b/include/uapi/linux/ip6_tunnel.h
index 85182a839d42..35af4d9c35fb 100644
--- a/include/uapi/linux/ip6_tunnel.h
+++ b/include/uapi/linux/ip6_tunnel.h
@@ -6,7 +6,6 @@
#include <linux/if.h> /* For IFNAMSIZ. */
#include <linux/in6.h> /* For struct in6_addr. */
-#define IPV6_TLV_TNL_ENCAP_LIMIT 4
#define IPV6_DEFAULT_TNL_ENCAP_LIMIT 4
/* don't add encapsulation limit if one isn't present in inner packet */
--
2.43.0
* [PATCH net-next v5 3/7] ipv6: Add case for IPV6_TLV_TNL_ENCAP_LIMIT in EH TLV switch
2026-01-26 19:48 [PATCH net-next v5 0/7] ipv6: Address ext hdr DoS vulnerabilities Tom Herbert
2026-01-26 19:48 ` [PATCH net-next v5 1/7] ipv6: Check of max HBH or DestOp sysctl is zero and drop if it is Tom Herbert
2026-01-26 19:48 ` [PATCH net-next v5 2/7] ipv6: Cleanup IPv6 TLV definitions Tom Herbert
@ 2026-01-26 19:48 ` Tom Herbert
2026-01-27 17:52 ` Justin Iurman
2026-01-29 5:31 ` Willem de Bruijn
2026-01-26 19:48 ` [PATCH net-next v5 4/7] ipv6: Set HBH and DestOpt limits to 2 Tom Herbert
` (3 subsequent siblings)
6 siblings, 2 replies; 30+ messages in thread
From: Tom Herbert @ 2026-01-26 19:48 UTC (permalink / raw)
To: davem, kuba, netdev, justin.iurman; +Cc: Tom Herbert
IPV6_TLV_TNL_ENCAP_LIMIT is a recognized Destination option that is
processed in ip6_tunnel.c. Add a case for it in the switch in
ip6_parse_tlv so that it is recognized as a known option.
Also remove the unlikely around the check for max_count < 0 since the
default limits for HBH and Destination options can be less than zero.
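To make the effect concrete, here is a simplified userspace walk over
an options area. It is a sketch under stated assumptions, not the
kernel's ip6_parse_tlv: parse_opts() and option_is_known() are
hypothetical names, the known-option list ignores the HBH/DestOpt
distinction, and unknown options are simply skipped unless
disallow_unknowns is set.

```c
#include <stdbool.h>
#include <stddef.h>

#define TLV_PAD1 0
#define TLV_PADN 1

/* Simplified "known option" test for this sketch only. */
static bool option_is_known(unsigned char type)
{
	switch (type) {
	case 4:   /* tunnel encapsulation limit */
	case 5:   /* router alert */
	case 7:   /* CALIPSO */
	case 49:  /* IOAM */
	case 194: /* jumbo payload */
	case 201: /* home address option */
		return true;
	default:
		return false;
	}
}

/* Walk the TLVs in opts[0..len). Return false if an option is
 * malformed, if more than max_count non-padding options are present,
 * or if an unknown option is seen while disallow_unknowns is set.
 */
static bool parse_opts(const unsigned char *opts, size_t len,
		       int max_count, bool disallow_unknowns)
{
	size_t off = 0;
	int tlv_count = 0;

	while (off < len) {
		unsigned char type = opts[off];
		size_t optlen;

		if (type == TLV_PAD1) {
			off++; /* Pad1 is a single byte with no length */
			continue;
		}
		if (len - off < 2)
			return false; /* no room for the length byte */
		optlen = (size_t)opts[off + 1] + 2;
		if (optlen > len - off)
			return false; /* option overruns the header */
		if (type != TLV_PADN) {
			if (++tlv_count > max_count)
				return false;
			if (disallow_unknowns && !option_is_known(type))
				return false;
		}
		off += optlen;
	}
	return true;
}
```

In this model, an options area holding only the tunnel encapsulation
limit is accepted even when unknown options are disallowed, because
type 4 is treated as a recognized option.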
Signed-off-by: Tom Herbert <tom@herbertland.com>
---
net/ipv6/exthdrs.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
index b9d186784b96..6925cfad94d2 100644
--- a/net/ipv6/exthdrs.c
+++ b/net/ipv6/exthdrs.c
@@ -122,7 +122,7 @@ static bool ip6_parse_tlv(bool hopbyhop,
int tlv_count = 0;
int padlen = 0;
- if (unlikely(max_count < 0)) {
+ if (max_count < 0) {
disallow_unknowns = true;
max_count = -max_count;
}
@@ -202,6 +202,16 @@ static bool ip6_parse_tlv(bool hopbyhop,
if (!ipv6_dest_hao(skb, off))
return false;
break;
+#endif
+#if IS_ENABLED(CONFIG_IPV6_TUNNEL)
+ case IPV6_TLV_TNL_ENCAP_LIMIT:
+ /* The tunnel encapsulation option.
+ * This is handled in ip6_tunnel.c so
+ * we don't need to do anything here
+ * except to accept it as a recognized
+ * option
+ */
+ break;
#endif
default:
if (!ip6_tlvopt_unknown(skb, off,
--
2.43.0
* [PATCH net-next v5 4/7] ipv6: Set HBH and DestOpt limits to 2
2026-01-26 19:48 [PATCH net-next v5 0/7] ipv6: Address ext hdr DoS vulnerabilities Tom Herbert
` (2 preceding siblings ...)
2026-01-26 19:48 ` [PATCH net-next v5 3/7] ipv6: Add case for IPV6_TLV_TNL_ENCAP_LIMIT in EH TLV switch Tom Herbert
@ 2026-01-26 19:48 ` Tom Herbert
2026-01-27 17:55 ` Justin Iurman
2026-01-26 19:48 ` [PATCH net-next v5 5/7] ipv6: Document defaults for max_{dst|hbh}_opts_number sysctls Tom Herbert
` (2 subsequent siblings)
6 siblings, 1 reply; 30+ messages in thread
From: Tom Herbert @ 2026-01-26 19:48 UTC (permalink / raw)
To: davem, kuba, netdev, justin.iurman; +Cc: Tom Herbert
Set the default limits of non-padding Hop-by-Hop and Destination
options to 2. This means that if a packet contains more than two
non-padding options then it will be dropped. The previous limit
was 8, but that was too liberal considering that the stack only
supports two Destination Options and the most Hop-by-Hop options
likely to ever be in the same packet are IOAM and JUMBO. The limit
can be increased via sysctl for private use and experimentation.
Signed-off-by: Tom Herbert <tom@herbertland.com>
---
include/net/ipv6.h | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index c7f597da01cd..31d270c8c2e4 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -86,9 +86,12 @@ struct ip_tunnel_info;
* silently discarded.
*/
-/* Default limits for Hop-by-Hop and Destination options */
-#define IP6_DEFAULT_MAX_DST_OPTS_CNT 8
-#define IP6_DEFAULT_MAX_HBH_OPTS_CNT 8
+/* Default limits for Hop-by-Hop and Destination non-padding options. The
+ * default value for both is 2. This sets a limit at two non-padding options
+ * (see sysctl documentation)
+ */
+#define IP6_DEFAULT_MAX_DST_OPTS_CNT 2
+#define IP6_DEFAULT_MAX_HBH_OPTS_CNT 2
#define IP6_DEFAULT_MAX_DST_OPTS_LEN INT_MAX /* No limit */
#define IP6_DEFAULT_MAX_HBH_OPTS_LEN INT_MAX /* No limit */
--
2.43.0
* [PATCH net-next v5 5/7] ipv6: Document defaults for max_{dst|hbh}_opts_number sysctls
2026-01-26 19:48 [PATCH net-next v5 0/7] ipv6: Address ext hdr DoS vulnerabilities Tom Herbert
` (3 preceding siblings ...)
2026-01-26 19:48 ` [PATCH net-next v5 4/7] ipv6: Set HBH and DestOpt limits to 2 Tom Herbert
@ 2026-01-26 19:48 ` Tom Herbert
2026-01-27 17:57 ` Justin Iurman
2026-01-26 19:48 ` [PATCH net-next v5 6/7] ipv6: Enforce Extension Header ordering Tom Herbert
2026-01-26 19:48 ` [PATCH net-next v5 7/7] ipv6: Document enforce_ext_hdr_order sysctl Tom Herbert
6 siblings, 1 reply; 30+ messages in thread
From: Tom Herbert @ 2026-01-26 19:48 UTC (permalink / raw)
To: davem, kuba, netdev, justin.iurman; +Cc: Tom Herbert
In the descriptions of max_dst_opts_number and max_hbh_opts_number
sysctls add text about how a zero setting means that a packet with
any Destination or Hop-by-Hop options is dropped.
Document that the defaults for max_dst_opts_number and
max_hbh_opts_number are 2, which means up to two options may be accepted.
Signed-off-by: Tom Herbert <tom@herbertland.com>
---
Documentation/networking/ip-sysctl.rst | 24 +++++++++++++++---------
1 file changed, 15 insertions(+), 9 deletions(-)
diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index bc9a01606daf..4f568b0e39d2 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -2475,19 +2475,25 @@ mld_qrv - INTEGER
max_dst_opts_number - INTEGER
Maximum number of non-padding TLVs allowed in a Destination
- options extension header. If this value is less than zero
- then unknown options are disallowed and the number of known
- TLVs allowed is the absolute value of this number.
+ options extension header. If this value is zero then receive
+ Destination Options processing is disabled in which case packets
+ with the Destination Options extension header are dropped. If
+ this value is less than zero then unknown options are disallowed
+ and the number of known TLVs allowed is the absolute value of
+ this number.
- Default: 8
+ Default: 2
max_hbh_opts_number - INTEGER
Maximum number of non-padding TLVs allowed in a Hop-by-Hop
- options extension header. If this value is less than zero
- then unknown options are disallowed and the number of known
- TLVs allowed is the absolute value of this number.
-
- Default: 8
+ options extension header. If this value is zero then receive
+ Hop-by-Hop Options processing is disabled in which case packets
+ with the Hop-by-Hop Options extension header are dropped.
+ If this value is less than zero then unknown options are disallowed
+ and the number of known TLVs allowed is the absolute value of this
+ number.
+
+ Default: 2
max_dst_opts_length - INTEGER
Maximum length allowed for a Destination options extension
--
2.43.0
* [PATCH net-next v5 6/7] ipv6: Enforce Extension Header ordering
2026-01-26 19:48 [PATCH net-next v5 0/7] ipv6: Address ext hdr DoS vulnerabilities Tom Herbert
` (4 preceding siblings ...)
2026-01-26 19:48 ` [PATCH net-next v5 5/7] ipv6: Document defaults for max_{dst|hbh}_opts_number sysctls Tom Herbert
@ 2026-01-26 19:48 ` Tom Herbert
2026-01-27 19:48 ` Justin Iurman
2026-01-29 5:18 ` Willem de Bruijn
2026-01-26 19:48 ` [PATCH net-next v5 7/7] ipv6: Document enforce_ext_hdr_order sysctl Tom Herbert
6 siblings, 2 replies; 30+ messages in thread
From: Tom Herbert @ 2026-01-26 19:48 UTC (permalink / raw)
To: davem, kuba, netdev, justin.iurman; +Cc: Tom Herbert
RFC8200 highly recommends that different Extension Headers be sent in
a prescribed order and that all Extension Header types occur at most
once in a packet, with the exception of Destination Options that may
occur twice. This patch enforces that the ordering be followed in
received packets.
The allowed order of Extension Headers is:
IPv6 header
Hop-by-Hop Options header
Destination Options before the Routing Header
Routing header
Fragment header
Authentication header
Encapsulating Security Payload header
Destination Options header
Upper-Layer header
Each Extension Header may be present only once in a packet, except
for Destination Options, which may occur twice.
net.ipv6.enforce_ext_hdr_order is a sysctl to enable or disable
enforcement of Extension Header order. If it is set to zero then
Extension Header order and number of occurrences are not checked
in receive processing (except for Hop-by-Hop Options, which
must be the first Extension Header and can only occur once in
a packet).
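The check can be modelled in userspace as follows. This is a sketch
under stated assumptions rather than the patch itself: the ORDER_*
names and ext_hdr_order_ok() are hypothetical, the bit positions copy
the IPV6_EXT_HDR_ORDER_* layout, Hop-by-Hop is treated like any other
header, and the test for an earlier Destination Options header uses a
bitwise AND on the mask of headers seen so far.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ORDER_HOP            (1u << 0)
#define ORDER_DEST_BEFORE_RH (1u << 1)
#define ORDER_ROUTING        (1u << 2)
#define ORDER_FRAGMENT       (1u << 3)
#define ORDER_AUTH           (1u << 4)
#define ORDER_ESP            (1u << 5)
#define ORDER_DEST           (1u << 6)

/* hdrs[i] is the order bit of the i-th extension header in a packet.
 * Return true if the sequence follows the enforced order.
 */
static bool ext_hdr_order_ok(const uint32_t *hdrs, size_t n)
{
	uint32_t seen = 0;

	for (size_t i = 0; i < n; i++) {
		uint32_t cur = hdrs[i];

		/* cur <= seen means this header, or one later in the
		 * order, has already been seen.
		 */
		if (cur <= seen) {
			/* A Routing header after Destination Options:
			 * reclassify those options as "before RH".
			 */
			if (cur == ORDER_ROUTING && (seen & ORDER_DEST))
				seen = (seen & ~ORDER_DEST) |
				       ORDER_DEST_BEFORE_RH;
			if (cur <= seen)
				return false;
		}
		seen |= cur;
	}
	return true;
}
```

Because each header is a single bit and the mask is the OR of all
headers seen, comparing the new bit against the mask as an integer is
enough to detect both repeats and out-of-order headers.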
Signed-off-by: Tom Herbert <tom@herbertland.com>
---
include/net/netns/ipv6.h | 1 +
include/net/protocol.h | 14 +++++++++++++
net/ipv6/af_inet6.c | 1 +
net/ipv6/exthdrs.c | 2 ++
net/ipv6/ip6_input.c | 42 ++++++++++++++++++++++++++++++++++++++
net/ipv6/reassembly.c | 1 +
net/ipv6/sysctl_net_ipv6.c | 7 +++++++
net/ipv6/xfrm6_protocol.c | 2 ++
8 files changed, 70 insertions(+)
diff --git a/include/net/netns/ipv6.h b/include/net/netns/ipv6.h
index 34bdb1308e8f..2db56718ea60 100644
--- a/include/net/netns/ipv6.h
+++ b/include/net/netns/ipv6.h
@@ -61,6 +61,7 @@ struct netns_sysctl_ipv6 {
u8 fib_notify_on_flag_change;
u8 icmpv6_error_anycast_as_unicast;
u8 icmpv6_errors_extension_mask;
+ u8 enforce_ext_hdr_order;
};
struct netns_ipv6 {
diff --git a/include/net/protocol.h b/include/net/protocol.h
index b2499f88f8f8..0f1676625570 100644
--- a/include/net/protocol.h
+++ b/include/net/protocol.h
@@ -50,6 +50,19 @@ struct net_protocol {
};
#if IS_ENABLED(CONFIG_IPV6)
+
+/* Order of extension headers as prescribed in RFC8200. The ordering and
+ * number of extension headers in a packet can be enforced in IPv6 receive
+ * processing.
+ */
+#define IPV6_EXT_HDR_ORDER_HOP BIT(0)
+#define IPV6_EXT_HDR_ORDER_DEST_BEFORE_RH BIT(1)
+#define IPV6_EXT_HDR_ORDER_ROUTING BIT(2)
+#define IPV6_EXT_HDR_ORDER_FRAGMENT BIT(3)
+#define IPV6_EXT_HDR_ORDER_AUTH BIT(4)
+#define IPV6_EXT_HDR_ORDER_ESP BIT(5)
+#define IPV6_EXT_HDR_ORDER_DEST BIT(6)
+
struct inet6_protocol {
int (*handler)(struct sk_buff *skb);
@@ -61,6 +74,7 @@ struct inet6_protocol {
unsigned int flags; /* INET6_PROTO_xxx */
u32 secret;
+ u32 ext_hdr_order;
};
#define INET6_PROTO_NOPOLICY 0x1
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index bd29840659f3..43097360ce64 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -980,6 +980,7 @@ static int __net_init inet6_net_init(struct net *net)
net->ipv6.sysctl.max_dst_opts_len = IP6_DEFAULT_MAX_DST_OPTS_LEN;
net->ipv6.sysctl.max_hbh_opts_len = IP6_DEFAULT_MAX_HBH_OPTS_LEN;
net->ipv6.sysctl.fib_notify_on_flag_change = 0;
+ net->ipv6.sysctl.enforce_ext_hdr_order = 1;
atomic_set(&net->ipv6.fib6_sernum, 1);
net->ipv6.sysctl.ioam6_id = IOAM6_DEFAULT_ID;
diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
index 6925cfad94d2..4ab94c8cddb9 100644
--- a/net/ipv6/exthdrs.c
+++ b/net/ipv6/exthdrs.c
@@ -845,11 +845,13 @@ static int ipv6_rthdr_rcv(struct sk_buff *skb)
static const struct inet6_protocol rthdr_protocol = {
.handler = ipv6_rthdr_rcv,
.flags = INET6_PROTO_NOPOLICY,
+ .ext_hdr_order = IPV6_EXT_HDR_ORDER_ROUTING,
};
static const struct inet6_protocol destopt_protocol = {
.handler = ipv6_destopt_rcv,
.flags = INET6_PROTO_NOPOLICY,
+ .ext_hdr_order = IPV6_EXT_HDR_ORDER_DEST,
};
static const struct inet6_protocol nodata_protocol = {
diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
index 168ec07e31cc..ab921c0a94af 100644
--- a/net/ipv6/ip6_input.c
+++ b/net/ipv6/ip6_input.c
@@ -355,6 +355,27 @@ void ipv6_list_rcv(struct list_head *head, struct packet_type *pt,
ip6_sublist_rcv(&sublist, curr_dev, curr_net);
}
+static u32 check_dst_opts_before_rh(const struct inet6_protocol *ipprot,
+ u32 ext_hdrs)
+{
+ /* Check if Destination Options before the Routing Header are
+ * present.
+ */
+ if (ipprot->ext_hdr_order != IPV6_EXT_HDR_ORDER_ROUTING ||
+ !(ext_hdrs & IPV6_EXT_HDR_ORDER_DEST))
+ return ext_hdrs;
+
+ /* We have Destination Options before the Routing Header. Set
+ * the mask of received extension headers to reflect that. We promote
+ * the bit from indicating just Destination Options present to
+ * Destination Options before the Routing Header being present
+ */
+ ext_hdrs = (ext_hdrs & ~IPV6_EXT_HDR_ORDER_DEST) |
+ IPV6_EXT_HDR_ORDER_DEST_BEFORE_RH;
+
+ return ext_hdrs;
+}
+
INDIRECT_CALLABLE_DECLARE(int tcp_v6_rcv(struct sk_buff *));
/*
@@ -366,6 +387,7 @@ void ip6_protocol_deliver_rcu(struct net *net, struct sk_buff *skb, int nexthdr,
const struct inet6_protocol *ipprot;
struct inet6_dev *idev;
unsigned int nhoff;
+ u32 ext_hdrs = 0;
SKB_DR(reason);
bool raw;
@@ -427,6 +449,26 @@ void ip6_protocol_deliver_rcu(struct net *net, struct sk_buff *skb, int nexthdr,
goto discard;
}
}
+
+ if (ipprot->ext_hdr_order &&
+ READ_ONCE(net->ipv6.sysctl.enforce_ext_hdr_order)) {
+ /* The protocol is an extension header and EH ordering
+ * is being enforced. Discard packet if we've already
+ * seen this EH or one that is lower in the order list
+ */
+ if (ipprot->ext_hdr_order <= ext_hdrs) {
+ /* Check if there's Destination Options
+ * before the Routing Header
+ */
+ ext_hdrs = check_dst_opts_before_rh(ipprot,
+ ext_hdrs);
+ if (ipprot->ext_hdr_order <= ext_hdrs)
+ goto discard;
+ }
+
+ ext_hdrs |= ipprot->ext_hdr_order;
+ }
+
if (!(ipprot->flags & INET6_PROTO_NOPOLICY)) {
if (!xfrm6_policy_check(NULL, XFRM_POLICY_IN, skb)) {
SKB_DR_SET(reason, XFRM_POLICY);
diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c
index 25ec8001898d..91dba72c5a3c 100644
--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -414,6 +414,7 @@ static int ipv6_frag_rcv(struct sk_buff *skb)
static const struct inet6_protocol frag_protocol = {
.handler = ipv6_frag_rcv,
.flags = INET6_PROTO_NOPOLICY,
+ .ext_hdr_order = IPV6_EXT_HDR_ORDER_FRAGMENT,
};
#ifdef CONFIG_SYSCTL
diff --git a/net/ipv6/sysctl_net_ipv6.c b/net/ipv6/sysctl_net_ipv6.c
index d2cd33e2698d..543b6acdb11d 100644
--- a/net/ipv6/sysctl_net_ipv6.c
+++ b/net/ipv6/sysctl_net_ipv6.c
@@ -213,6 +213,13 @@ static struct ctl_table ipv6_table_template[] = {
.proc_handler = proc_doulongvec_minmax,
.extra2 = &ioam6_id_wide_max,
},
+ {
+ .procname = "enforce_ext_hdr_order",
+ .data = &init_net.ipv6.sysctl.enforce_ext_hdr_order,
+ .maxlen = sizeof(u8),
+ .mode = 0644,
+ .proc_handler = proc_dou8vec_minmax,
+ },
};
static struct ctl_table ipv6_rotable[] = {
diff --git a/net/ipv6/xfrm6_protocol.c b/net/ipv6/xfrm6_protocol.c
index ea2f805d3b01..5826edf67f64 100644
--- a/net/ipv6/xfrm6_protocol.c
+++ b/net/ipv6/xfrm6_protocol.c
@@ -197,12 +197,14 @@ static const struct inet6_protocol esp6_protocol = {
.handler = xfrm6_esp_rcv,
.err_handler = xfrm6_esp_err,
.flags = INET6_PROTO_NOPOLICY,
+ .ext_hdr_order = IPV6_EXT_HDR_ORDER_ESP,
};
static const struct inet6_protocol ah6_protocol = {
.handler = xfrm6_ah_rcv,
.err_handler = xfrm6_ah_err,
.flags = INET6_PROTO_NOPOLICY,
+ .ext_hdr_order = IPV6_EXT_HDR_ORDER_AUTH
};
static const struct inet6_protocol ipcomp6_protocol = {
--
2.43.0
* [PATCH net-next v5 7/7] ipv6: Document enforce_ext_hdr_order sysctl
2026-01-26 19:48 [PATCH net-next v5 0/7] ipv6: Address ext hdr DoS vulnerabilities Tom Herbert
` (5 preceding siblings ...)
2026-01-26 19:48 ` [PATCH net-next v5 6/7] ipv6: Enforce Extension Header ordering Tom Herbert
@ 2026-01-26 19:48 ` Tom Herbert
2026-01-27 18:00 ` Justin Iurman
6 siblings, 1 reply; 30+ messages in thread
From: Tom Herbert @ 2026-01-26 19:48 UTC (permalink / raw)
To: davem, kuba, netdev, justin.iurman; +Cc: Tom Herbert
Document the enforce_ext_hdr_order sysctl that controls whether
Extension Header order is enforced on receive.
Signed-off-by: Tom Herbert <tom@herbertland.com>
---
Documentation/networking/ip-sysctl.rst | 28 ++++++++++++++++++++++++++
1 file changed, 28 insertions(+)
diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index 4f568b0e39d2..1b12b955fa34 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -2581,6 +2581,34 @@ ioam6_id_wide - LONG INTEGER
Default: 0xFFFFFFFFFFFFFF
+enforce_ext_hdr_order - BOOLEAN
+ Enforce recommended Extension Header ordering in RFC8200.
+ If the sysctl is set to 1 then the ordering is
+ enforced in received packets and each Extension Header
+ may be present at most once per packet. If the sysctl is
+ set to 0 then ordering is not enforced and Extension Headers
+ may be present in any order and have any number of
+ occurrences per packet (except for Hop-by-Hop Options).
+
+ The Extension Header order is:
+
+ IPv6 header
+ Hop-by-Hop Options header
+ Destination Options before the Routing header
+ Routing header
+ Fragment header
+ Authentication header
+ Encapsulating Security Payload header
+ Destination Options header
+ Upper-Layer header
+
+ Possible values:
+
+ - 0 (disabled)
+ - 1 (enabled)
+
+ Default: 1 (enabled)
+
IPv6 Fragmentation:
ip6frag_high_thresh - INTEGER
--
2.43.0
* Re: [PATCH net-next v5 1/7] ipv6: Check of max HBH or DestOp sysctl is zero and drop if it is
2026-01-26 19:48 ` [PATCH net-next v5 1/7] ipv6: Check of max HBH or DestOp sysctl is zero and drop if it is Tom Herbert
@ 2026-01-27 17:49 ` Justin Iurman
2026-01-27 17:50 ` Justin Iurman
1 sibling, 0 replies; 30+ messages in thread
From: Justin Iurman @ 2026-01-27 17:49 UTC (permalink / raw)
To: Tom Herbert, davem, kuba, netdev
On 1/26/26 20:48, Tom Herbert wrote:
> In IPv6 Destination options processing function check if
> net->ipv6.sysctl.max_dst_opts_cnt is zero up front. If it is zero then
> drop the packet since Destination Options processing is disabled.
>
> Similarly, in IPv6 hop-by-hop options processing function check if
> net->ipv6.sysctl.max_hbh_opts_cnt is zero up front. If it is zero then
> drop the packet since Hop-by-Hop Options processing is disabled.
>
> Signed-off-by: Tom Herbert <tom@herbertland.com>
> ---
> net/ipv6/exthdrs.c | 18 ++++++++++--------
> 1 file changed, 10 insertions(+), 8 deletions(-)
>
> diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
> index 54088fa0c09d..b9d186784b96 100644
> --- a/net/ipv6/exthdrs.c
> +++ b/net/ipv6/exthdrs.c
> @@ -301,9 +301,11 @@ static int ipv6_destopt_rcv(struct sk_buff *skb)
> #endif
> struct dst_entry *dst = skb_dst(skb);
> struct net *net = dev_net(skb->dev);
> - int extlen;
> + int extlen, max_opts_cnt;
>
> - if (!pskb_may_pull(skb, skb_transport_offset(skb) + 8) ||
> + max_opts_cnt = READ_ONCE(net->ipv6.sysctl.max_dst_opts_cnt);
> + if (!max_opts_cnt ||
> + !pskb_may_pull(skb, skb_transport_offset(skb) + 8) ||
> !pskb_may_pull(skb, (skb_transport_offset(skb) +
> ((skb_transport_header(skb)[1] + 1) << 3)))) {
> __IP6_INC_STATS(dev_net(dst_dev(dst)), idev,
> @@ -322,8 +324,7 @@ static int ipv6_destopt_rcv(struct sk_buff *skb)
> dstbuf = opt->dst1;
> #endif
>
> - if (ip6_parse_tlv(false, skb,
> - READ_ONCE(net->ipv6.sysctl.max_dst_opts_cnt))) {
> + if (ip6_parse_tlv(false, skb, max_opts_cnt)) {
> skb->transport_header += extlen;
> opt = IP6CB(skb);
> #if IS_ENABLED(CONFIG_IPV6_MIP6)
> @@ -1033,7 +1034,7 @@ int ipv6_parse_hopopts(struct sk_buff *skb)
> {
> struct inet6_skb_parm *opt = IP6CB(skb);
> struct net *net = dev_net(skb->dev);
> - int extlen;
> + int extlen, max_opts_cnt;
>
> /*
> * skb_network_header(skb) is equal to skb->data, and
> @@ -1041,7 +1042,9 @@ int ipv6_parse_hopopts(struct sk_buff *skb)
> * sizeof(struct ipv6hdr) by definition of
> * hop-by-hop options.
> */
> - if (!pskb_may_pull(skb, sizeof(struct ipv6hdr) + 8) ||
> + max_opts_cnt = READ_ONCE(net->ipv6.sysctl.max_hbh_opts_cnt);
> + if (!max_opts_cnt ||
> + !pskb_may_pull(skb, sizeof(struct ipv6hdr) + 8) ||
> !pskb_may_pull(skb, (sizeof(struct ipv6hdr) +
> ((skb_transport_header(skb)[1] + 1) << 3)))) {
> fail_and_free:
> @@ -1054,8 +1057,7 @@ int ipv6_parse_hopopts(struct sk_buff *skb)
> goto fail_and_free;
>
> opt->flags |= IP6SKB_HOPBYHOP;
> - if (ip6_parse_tlv(true, skb,
> - READ_ONCE(net->ipv6.sysctl.max_hbh_opts_cnt))) {
> + if (ip6_parse_tlv(true, skb, max_opts_cnt)) {
> skb->transport_header += extlen;
> opt = IP6CB(skb);
> opt->nhoff = sizeof(struct ipv6hdr);
Reviewed-by: Justin Iurman <justin.iurman@gmail.com>
* Re: [PATCH net-next v5 1/7] ipv6: Check of max HBH or DestOp sysctl is zero and drop if it is
2026-01-26 19:48 ` [PATCH net-next v5 1/7] ipv6: Check of max HBH or DestOp sysctl is zero and drop if it is Tom Herbert
2026-01-27 17:49 ` Justin Iurman
@ 2026-01-27 17:50 ` Justin Iurman
1 sibling, 0 replies; 30+ messages in thread
From: Justin Iurman @ 2026-01-27 17:50 UTC (permalink / raw)
To: Tom Herbert, davem, kuba, netdev
On 1/26/26 20:48, Tom Herbert wrote:
> In IPv6 Destination options processing function check if
> net->ipv6.sysctl.max_dst_opts_cnt is zero up front. If it is zero then
> drop the packet since Destination Options processing is disabled.
>
> Similarly, in IPv6 hop-by-hop options processing function check if
> net->ipv6.sysctl.max_hbh_opts_cnt is zero up front. If it is zero then
> drop the packet since Hop-by-Hop Options processing is disabled.
>
> Signed-off-by: Tom Herbert <tom@herbertland.com>
> ---
> net/ipv6/exthdrs.c | 18 ++++++++++--------
> 1 file changed, 10 insertions(+), 8 deletions(-)
>
> diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
> index 54088fa0c09d..b9d186784b96 100644
> --- a/net/ipv6/exthdrs.c
> +++ b/net/ipv6/exthdrs.c
> @@ -301,9 +301,11 @@ static int ipv6_destopt_rcv(struct sk_buff *skb)
> #endif
> struct dst_entry *dst = skb_dst(skb);
> struct net *net = dev_net(skb->dev);
> - int extlen;
> + int extlen, max_opts_cnt;
>
> - if (!pskb_may_pull(skb, skb_transport_offset(skb) + 8) ||
> + max_opts_cnt = READ_ONCE(net->ipv6.sysctl.max_dst_opts_cnt);
> + if (!max_opts_cnt ||
> + !pskb_may_pull(skb, skb_transport_offset(skb) + 8) ||
> !pskb_may_pull(skb, (skb_transport_offset(skb) +
> ((skb_transport_header(skb)[1] + 1) << 3)))) {
> __IP6_INC_STATS(dev_net(dst_dev(dst)), idev,
> @@ -322,8 +324,7 @@ static int ipv6_destopt_rcv(struct sk_buff *skb)
> dstbuf = opt->dst1;
> #endif
>
> - if (ip6_parse_tlv(false, skb,
> - READ_ONCE(net->ipv6.sysctl.max_dst_opts_cnt))) {
> + if (ip6_parse_tlv(false, skb, max_opts_cnt)) {
> skb->transport_header += extlen;
> opt = IP6CB(skb);
> #if IS_ENABLED(CONFIG_IPV6_MIP6)
> @@ -1033,7 +1034,7 @@ int ipv6_parse_hopopts(struct sk_buff *skb)
> {
> struct inet6_skb_parm *opt = IP6CB(skb);
> struct net *net = dev_net(skb->dev);
> - int extlen;
> + int extlen, max_opts_cnt;
>
> /*
> * skb_network_header(skb) is equal to skb->data, and
> @@ -1041,7 +1042,9 @@ int ipv6_parse_hopopts(struct sk_buff *skb)
> * sizeof(struct ipv6hdr) by definition of
> * hop-by-hop options.
> */
> - if (!pskb_may_pull(skb, sizeof(struct ipv6hdr) + 8) ||
> + max_opts_cnt = READ_ONCE(net->ipv6.sysctl.max_hbh_opts_cnt);
> + if (!max_opts_cnt ||
> + !pskb_may_pull(skb, sizeof(struct ipv6hdr) + 8) ||
> !pskb_may_pull(skb, (sizeof(struct ipv6hdr) +
> ((skb_transport_header(skb)[1] + 1) << 3)))) {
> fail_and_free:
> @@ -1054,8 +1057,7 @@ int ipv6_parse_hopopts(struct sk_buff *skb)
> goto fail_and_free;
>
> opt->flags |= IP6SKB_HOPBYHOP;
> - if (ip6_parse_tlv(true, skb,
> - READ_ONCE(net->ipv6.sysctl.max_hbh_opts_cnt))) {
> + if (ip6_parse_tlv(true, skb, max_opts_cnt)) {
> skb->transport_header += extlen;
> opt = IP6CB(skb);
> opt->nhoff = sizeof(struct ipv6hdr);
Reviewed-by: Justin Iurman <justin.iurman@gmail.com>
* Re: [PATCH net-next v5 2/7] ipv6: Cleanup IPv6 TLV definitions
2026-01-26 19:48 ` [PATCH net-next v5 2/7] ipv6: Cleanup IPv6 TLV definitions Tom Herbert
@ 2026-01-27 17:51 ` Justin Iurman
2026-01-29 5:30 ` Willem de Bruijn
1 sibling, 0 replies; 30+ messages in thread
From: Justin Iurman @ 2026-01-27 17:51 UTC (permalink / raw)
To: Tom Herbert, davem, kuba, netdev
On 1/26/26 20:48, Tom Herbert wrote:
> Move IPV6_TLV_TNL_ENCAP_LIMIT to uapi/linux/in6.h to be with the rest
> of the TLV definitions. Label each of the TLV definitions as to whether
> they are a Hop-by-Hop option, Destination option, or both.
>
> Signed-off-by: Tom Herbert <tom@herbertland.com>
> ---
> include/uapi/linux/in6.h | 21 ++++++++++++++-------
> include/uapi/linux/ip6_tunnel.h | 1 -
> 2 files changed, 14 insertions(+), 8 deletions(-)
>
> diff --git a/include/uapi/linux/in6.h b/include/uapi/linux/in6.h
> index 5a47339ef7d7..438283dc5fde 100644
> --- a/include/uapi/linux/in6.h
> +++ b/include/uapi/linux/in6.h
> @@ -140,14 +140,21 @@ struct in6_flowlabel_req {
>
> /*
> * IPv6 TLV options.
> + *
> + * Hop-by-Hop and Destination options share the same number space.
> + * For each option below whether it is a Hop-by-Hop option or
> + * a Destination option is indicated by HBH or DestOpt.
> */
> -#define IPV6_TLV_PAD1 0
> -#define IPV6_TLV_PADN 1
> -#define IPV6_TLV_ROUTERALERT 5
> -#define IPV6_TLV_CALIPSO 7 /* RFC 5570 */
> -#define IPV6_TLV_IOAM 49 /* RFC 9486 */
> -#define IPV6_TLV_JUMBO 194
> -#define IPV6_TLV_HAO 201 /* home address option */
> +#define IPV6_TLV_PAD1 0 /* HBH or DestOpt */
> +#define IPV6_TLV_PADN 1 /* HBH or DestOpt */
> +#define IPV6_TLV_TNL_ENCAP_LIMIT 4 /* RFC 2473, DestOpt */
> +#define IPV6_TLV_ROUTERALERT 5 /* HBH */
> +#define IPV6_TLV_CALIPSO 7 /* RFC 5570, HBH */
> +#define IPV6_TLV_IOAM 49 /* RFC 9486, HBH or Destopt
> + * IOAM sent and rcvd as HBH
> + */
> +#define IPV6_TLV_JUMBO 194 /* HBH */
> +#define IPV6_TLV_HAO 201 /* home address option, DestOpt */
>
> /*
> * IPV6 socket options
> diff --git a/include/uapi/linux/ip6_tunnel.h b/include/uapi/linux/ip6_tunnel.h
> index 85182a839d42..35af4d9c35fb 100644
> --- a/include/uapi/linux/ip6_tunnel.h
> +++ b/include/uapi/linux/ip6_tunnel.h
> @@ -6,7 +6,6 @@
> #include <linux/if.h> /* For IFNAMSIZ. */
> #include <linux/in6.h> /* For struct in6_addr. */
>
> -#define IPV6_TLV_TNL_ENCAP_LIMIT 4
> #define IPV6_DEFAULT_TNL_ENCAP_LIMIT 4
>
> /* don't add encapsulation limit if one isn't present in inner packet */
Reviewed-by: Justin Iurman <justin.iurman@gmail.com>
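For readers following along: the option numbers in the table above are not arbitrary. Per RFC 8200 section 4.2, the two high-order bits of an Option Type say what a node must do when it does not recognize the option, and the third bit says whether the option data may change en route. A minimal userspace sketch of that decoding follows; the names opt_unrec_action() and opt_is_mutable() are made up for illustration and are not kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Action for an unrecognized option, taken from the two high-order
 * bits of the Option Type (RFC 8200, section 4.2).
 */
enum opt_action {
	OPT_SKIP = 0,			/* 00: skip over the option */
	OPT_DISCARD = 1,		/* 01: discard the packet */
	OPT_DISCARD_ICMP = 2,		/* 10: discard and send ICMP Parameter Problem */
	OPT_DISCARD_ICMP_UCAST = 3,	/* 11: as 10, but only if dst is not multicast */
};

/* Hypothetical helper: extract the "unrecognized option" action. */
static enum opt_action opt_unrec_action(uint8_t type)
{
	return (enum opt_action)(type >> 6);
}

/* Hypothetical helper: third-highest bit set means the option data
 * may change en route.
 */
static bool opt_is_mutable(uint8_t type)
{
	return (type & 0x20) != 0;
}
```

With these helpers, IPV6_TLV_JUMBO (194) and IPV6_TLV_HAO (201) decode to the "discard and send ICMP unless multicast" action, while IPV6_TLV_TNL_ENCAP_LIMIT (4) decodes to "skip" and IPV6_TLV_IOAM (49) has the mutable bit set.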
* Re: [PATCH net-next v5 3/7] ipv6: Add case for IPV6_TLV_TNL_ENCAP_LIMIT in EH TLV switch
2026-01-26 19:48 ` [PATCH net-next v5 3/7] ipv6: Add case for IPV6_TLV_TNL_ENCAP_LIMIT in EH TLV switch Tom Herbert
@ 2026-01-27 17:52 ` Justin Iurman
2026-01-29 5:31 ` Willem de Bruijn
1 sibling, 0 replies; 30+ messages in thread
From: Justin Iurman @ 2026-01-27 17:52 UTC (permalink / raw)
To: Tom Herbert, davem, kuba, netdev
On 1/26/26 20:48, Tom Herbert wrote:
> IPV6_TLV_TNL_ENCAP_LIMIT is a recognized Destination option that is
> processed in ip_tunnel.c. Add a case for it in the switch in
> ip6_parse_tlv so that it is recognized as a known option.
>
> Also remove the unlikely around the check for max_count < 0 since the
> default limits for HBH and Destination options can be less than zero.
>
> Signed-off-by: Tom Herbert <tom@herbertland.com>
> ---
> net/ipv6/exthdrs.c | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
> index b9d186784b96..6925cfad94d2 100644
> --- a/net/ipv6/exthdrs.c
> +++ b/net/ipv6/exthdrs.c
> @@ -122,7 +122,7 @@ static bool ip6_parse_tlv(bool hopbyhop,
> int tlv_count = 0;
> int padlen = 0;
>
> - if (unlikely(max_count < 0)) {
> + if (max_count < 0) {
> disallow_unknowns = true;
> max_count = -max_count;
> }
> @@ -202,6 +202,16 @@ static bool ip6_parse_tlv(bool hopbyhop,
> if (!ipv6_dest_hao(skb, off))
> return false;
> break;
> +#endif
> +#if IS_ENABLED(CONFIG_IPV6_TUNNEL)
> + case IPV6_TLV_TNL_ENCAP_LIMIT:
> + /* The tunnel encapsulation option.
> + * This is handled in ip6_tunnel.c so
> + * we don't need to do anything here
> + * except to accept it as a recognized
> + * option
> + */
> + break;
> #endif
> default:
> if (!ip6_tlvopt_unknown(skb, off,
Reviewed-by: Justin Iurman <justin.iurman@gmail.com>
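To make the accounting concrete, here is a simplified userspace model of how a TLV walk can count non-padding options against the limit. It is only a sketch under stated assumptions, not the kernel's ip6_parse_tlv(): the function names parse_tlvs() and tlv_is_known() are hypothetical, only the tunnel encapsulation limit is treated as a recognized option, padding-length checks are left out, and the up-front "limit is zero" drop from patch 1 lives in the callers and is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TLV_PAD1		0
#define TLV_PADN		1
#define TLV_TNL_ENCAP_LIMIT	4

/* Hypothetical stand-in for the per-type switch: only the tunnel
 * encapsulation limit counts as a recognized option in this model.
 */
static bool tlv_is_known(uint8_t type)
{
	return type == TLV_TNL_ENCAP_LIMIT;
}

/* Walk the options area of an extension header. Returns false if the
 * packet would be dropped. A negative max_count disallows unknown
 * options and uses the absolute value as the limit.
 */
static bool parse_tlvs(const uint8_t *opts, size_t len, int max_count)
{
	bool disallow_unknowns = false;
	int tlv_count = 0;
	size_t off = 0;

	if (max_count < 0) {
		disallow_unknowns = true;
		max_count = -max_count;
	}

	while (off < len) {
		uint8_t type = opts[off];
		size_t optlen;

		if (type == TLV_PAD1) {		/* single byte, no length */
			off++;
			continue;
		}
		if (len - off < 2)		/* truncated TLV header */
			return false;
		optlen = (size_t)opts[off + 1] + 2;
		if (optlen > len - off)		/* TLV overruns the header */
			return false;

		if (type != TLV_PADN) {
			/* Every non-padding option counts, known or not */
			if (++tlv_count > max_count)
				return false;
			/* Unknown options are skipped only when the two
			 * high-order type bits are 00 and unknowns are
			 * allowed.
			 */
			if (!tlv_is_known(type) &&
			    (disallow_unknowns || (type >> 6) != 0))
				return false;
		}
		off += optlen;
	}
	return true;
}
```

In this model a recognized option passes with a limit of 2 or -1, while an unknown skippable option passes with 2 but is rejected with -2, which is why the encapsulation limit needs its own case to be accounted for as a known option.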
* Re: [PATCH net-next v5 4/7] ipv6: Set HBH and DestOpt limits to 2
2026-01-26 19:48 ` [PATCH net-next v5 4/7] ipv6: Set HBH and DestOpt limits to 2 Tom Herbert
@ 2026-01-27 17:55 ` Justin Iurman
0 siblings, 0 replies; 30+ messages in thread
From: Justin Iurman @ 2026-01-27 17:55 UTC (permalink / raw)
To: Tom Herbert, davem, kuba, netdev
On 1/26/26 20:48, Tom Herbert wrote:
> Set the default limits of non-padding Hop-by-Hop and Destination
> options to 2. This means that if a packet contains more then two
> non-padding options then it will be dropped. The previous limit
> was 8, but that was too liberal considering that the stack only
> support two Destination Options and the most Hop-by-Hop options
> likely to ever be in the same packet are IOAM and JUMBO. The limit
> can be increased via sysctl for private use and experimenation.
>
> Signed-off-by: Tom Herbert <tom@herbertland.com>
> ---
> include/net/ipv6.h | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/include/net/ipv6.h b/include/net/ipv6.h
> index c7f597da01cd..31d270c8c2e4 100644
> --- a/include/net/ipv6.h
> +++ b/include/net/ipv6.h
> @@ -86,9 +86,12 @@ struct ip_tunnel_info;
> * silently discarded.
> */
>
> -/* Default limits for Hop-by-Hop and Destination options */
> -#define IP6_DEFAULT_MAX_DST_OPTS_CNT 8
> -#define IP6_DEFAULT_MAX_HBH_OPTS_CNT 8
> +/* Default limits for Hop-by-Hop and Destination non-padding options. The
> + * default value for both is 2. This sets a limit at two non-padding options
> + * (see sysctl documention)
Reported by AI:
s/documention/documentation
I guess there's no need to respin only for that?
> + */
> +#define IP6_DEFAULT_MAX_DST_OPTS_CNT 2
> +#define IP6_DEFAULT_MAX_HBH_OPTS_CNT 2
> #define IP6_DEFAULT_MAX_DST_OPTS_LEN INT_MAX /* No limit */
> #define IP6_DEFAULT_MAX_HBH_OPTS_LEN INT_MAX /* No limit */
>
Reviewed-by: Justin Iurman <justin.iurman@gmail.com>
* Re: [PATCH net-next v5 5/7] ipv6: Document defaults for max_{dst|hbh}_opts_number sysctls
2026-01-26 19:48 ` [PATCH net-next v5 5/7] ipv6: Document defaults for max_{dst|hbh}_opts_number sysctls Tom Herbert
@ 2026-01-27 17:57 ` Justin Iurman
0 siblings, 0 replies; 30+ messages in thread
From: Justin Iurman @ 2026-01-27 17:57 UTC (permalink / raw)
To: Tom Herbert, davem, kuba, netdev
On 1/26/26 20:48, Tom Herbert wrote:
> In the descriptions of max_dst_opts_number and max_hbh_opts_number
> sysctls add text about how a zero setting means that a packet with
> any Destination or Hop-by-Hop options is dropped.
>
> Report the defaults for max_dst_opts_number and max_hbh_opts_number
> are 2 which means up to two options may be accepted.
>
> Signed-off-by: Tom Herbert <tom@herbertland.com>
> ---
> Documentation/networking/ip-sysctl.rst | 24 +++++++++++++++---------
> 1 file changed, 15 insertions(+), 9 deletions(-)
>
> diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
> index bc9a01606daf..4f568b0e39d2 100644
> --- a/Documentation/networking/ip-sysctl.rst
> +++ b/Documentation/networking/ip-sysctl.rst
> @@ -2475,19 +2475,25 @@ mld_qrv - INTEGER
>
> max_dst_opts_number - INTEGER
> Maximum number of non-padding TLVs allowed in a Destination
> - options extension header. If this value is less than zero
> - then unknown options are disallowed and the number of known
> - TLVs allowed is the absolute value of this number.
> + options extension header. If this value is zero then receive
> + Destination Options processing is disabled in which case packets
> + with the Destination Options extension header are dropped. If
> + this value is less than zero then unknown options are disallowed
> + and the number of known TLVs allowed is the absolute value of
> + this number.
>
> - Default: 8
> + Default: 2
>
> max_hbh_opts_number - INTEGER
> Maximum number of non-padding TLVs allowed in a Hop-by-Hop
> - options extension header. If this value is less than zero
> - then unknown options are disallowed and the number of known
> - TLVs allowed is the absolute value of this number.
> -
> - Default: 8
> + options extension header. If this value is zero then receive
> + Hop-by-Hop Options processing is disabled in which case packets
> + with the Hop-by-Hop Options extension header are dropped.
> + If this value is less than zero then unknown options are disallowed
> + and the number of known TLVs allowed is the absolute value of this
> + number.
> +
> + Default: 2
>
> max_dst_opts_length - INTEGER
> Maximum length allowed for a Destination options extension
Reviewed-by: Justin Iurman <justin.iurman@gmail.com>
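The three cases described in the documentation text above (zero, negative, positive) can be summarised in a few lines of C. This is only an illustrative sketch: struct opts_limit and interpret_opts_limit() are hypothetical names, not something the patch adds.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical decoded form of a max_{dst|hbh}_opts_number value. */
struct opts_limit {
	bool drop_all;		/* processing disabled, drop any such header */
	bool disallow_unknowns;	/* unknown options cause a drop */
	int max_opts;		/* allowed number of non-padding options */
};

/* Decode the sysctl value as the documentation describes it:
 * zero disables processing, a negative value disallows unknown
 * options and uses its absolute value as the limit.
 */
static struct opts_limit interpret_opts_limit(int sysctl_val)
{
	struct opts_limit lim = { false, false, 0 };

	if (sysctl_val == 0) {
		lim.drop_all = true;
	} else if (sysctl_val < 0) {
		lim.disallow_unknowns = true;
		lim.max_opts = -sysctl_val;
	} else {
		lim.max_opts = sysctl_val;
	}
	return lim;
}
```

With the new default of 2, this decodes to "up to two non-padding options, unknown options allowed"; a setting of -2 keeps the same count but rejects unknown options.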
* Re: [PATCH net-next v5 7/7] ipv6: Document enforce_ext_hdr_order sysctl
2026-01-26 19:48 ` [PATCH net-next v5 7/7] ipv6: Document enforce_ext_hdr_order sysctl Tom Herbert
@ 2026-01-27 18:00 ` Justin Iurman
0 siblings, 0 replies; 30+ messages in thread
From: Justin Iurman @ 2026-01-27 18:00 UTC (permalink / raw)
To: Tom Herbert, davem, kuba, netdev
On 1/26/26 20:48, Tom Herbert wrote:
> Document the enforce_ext_hdr_order sysctl that controls whether
> Extension Header order is enforced on receive.
>
> Signed-off-by: Tom Herbert <tom@herbertland.com>
> ---
> Documentation/networking/ip-sysctl.rst | 28 ++++++++++++++++++++++++++
> 1 file changed, 28 insertions(+)
>
> diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
> index 4f568b0e39d2..1b12b955fa34 100644
> --- a/Documentation/networking/ip-sysctl.rst
> +++ b/Documentation/networking/ip-sysctl.rst
> @@ -2581,6 +2581,34 @@ ioam6_id_wide - LONG INTEGER
>
> Default: 0xFFFFFFFFFFFFFF
>
> +enforce_ext_hdr_order - BOOLEAN
> + Enforce recommended Extension Header ordering in RFC8200.
> + If the sysctl is set to 1 then the ordering the ordering is
Reported by AI:
s/the ordering the ordering/the ordering
> + enforced in received packets and each Extension Header
> + may be present at most once per packet. If the sysctl is
> + set to 0 then ordering is not enforced and Extension Headers
> + may be present in any order and have any number of
> + occurences per packet (except for Hop-by-Hop Options).
Reported by AI:
s/occurences/occurrences
> +
> + The Extension Header order is:
> +
> + IPv6 header
> + Hop-by-Hop Options header
> + Destination Options before the Routing header
> + Routing header
> + Fragment header
> + Authentication header
> + Encapsulating Security Payload header
> + Destination Options header
> + Upper-Layer header
> +
> + Possible values:
> +
> + - 0 (disabled)
> + - 1 (enabled)
> +
> + Default: 1 (enabled)
> +
> IPv6 Fragmentation:
>
> ip6frag_high_thresh - INTEGER
Reviewed-by: Justin Iurman <justin.iurman@gmail.com>
* Re: [PATCH net-next v5 6/7] ipv6: Enforce Extension Header ordering
2026-01-26 19:48 ` [PATCH net-next v5 6/7] ipv6: Enforce Extension Header ordering Tom Herbert
@ 2026-01-27 19:48 ` Justin Iurman
2026-01-29 5:18 ` Willem de Bruijn
1 sibling, 0 replies; 30+ messages in thread
From: Justin Iurman @ 2026-01-27 19:48 UTC (permalink / raw)
To: Tom Herbert, davem, kuba, netdev
On 1/26/26 20:48, Tom Herbert wrote:
> RFC8200 highly recommends that different Extension Headers be send in
> a prescibed order and all Extension Header types occur at most once
> in a packet with the exception of Destination Options that may
> occur twice. This patch enforces the ordering be folowed in received
> packets.
>
> The allowed order of Extension Headers is:
>
> IPv6 header
> Hop-by-Hop Options header
> Destination Options before the Routing Header
> Routing header
> Fragment header
> Authentication header
> Encapsulating Security Payload header
> Destination Options header
> Upper-Layer header
>
> Each Extension Header may be present only once in a packet.
>
> net.ipv6.enforce_ext_hdr_order is a sysctl to enable or disable
> enforcement of xtension Header order. If it is set to zero then
> Extension Header order and number of occurences is not checked
> in receive processeing (except for Hop-by-Hop Options that
> must be the first Extension Header and can only occur once in
> a packet.
>
> Signed-off-by: Tom Herbert <tom@herbertland.com>
> ---
> include/net/netns/ipv6.h | 1 +
> include/net/protocol.h | 14 +++++++++++++
> net/ipv6/af_inet6.c | 1 +
> net/ipv6/exthdrs.c | 2 ++
> net/ipv6/ip6_input.c | 42 ++++++++++++++++++++++++++++++++++++++
> net/ipv6/reassembly.c | 1 +
> net/ipv6/sysctl_net_ipv6.c | 7 +++++++
> net/ipv6/xfrm6_protocol.c | 2 ++
> 8 files changed, 70 insertions(+)
>
> diff --git a/include/net/netns/ipv6.h b/include/net/netns/ipv6.h
> index 34bdb1308e8f..2db56718ea60 100644
> --- a/include/net/netns/ipv6.h
> +++ b/include/net/netns/ipv6.h
> @@ -61,6 +61,7 @@ struct netns_sysctl_ipv6 {
> u8 fib_notify_on_flag_change;
> u8 icmpv6_error_anycast_as_unicast;
> u8 icmpv6_errors_extension_mask;
> + u8 enforce_ext_hdr_order;
> };
>
> struct netns_ipv6 {
> diff --git a/include/net/protocol.h b/include/net/protocol.h
> index b2499f88f8f8..0f1676625570 100644
> --- a/include/net/protocol.h
> +++ b/include/net/protocol.h
> @@ -50,6 +50,19 @@ struct net_protocol {
> };
>
> #if IS_ENABLED(CONFIG_IPV6)
> +
> +/* Order of extension headers as prescribed in RFC8200. The ordering and
> + * number of extension headers in a packet can be enforced in IPv6 receive
> + * processing.
> + */
> +#define IPV6_EXT_HDR_ORDER_HOP BIT(0)
> +#define IPV6_EXT_HDR_ORDER_DEST_BEFORE_RH BIT(1)
> +#define IPV6_EXT_HDR_ORDER_ROUTING BIT(2)
> +#define IPV6_EXT_HDR_ORDER_FRAGMENT BIT(3)
> +#define IPV6_EXT_HDR_ORDER_AUTH BIT(4)
> +#define IPV6_EXT_HDR_ORDER_ESP BIT(5)
> +#define IPV6_EXT_HDR_ORDER_DEST BIT(6)
> +
> struct inet6_protocol {
> int (*handler)(struct sk_buff *skb);
>
> @@ -61,6 +74,7 @@ struct inet6_protocol {
>
> unsigned int flags; /* INET6_PROTO_xxx */
> u32 secret;
> + u32 ext_hdr_order;
> };
>
> #define INET6_PROTO_NOPOLICY 0x1
> diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
> index bd29840659f3..43097360ce64 100644
> --- a/net/ipv6/af_inet6.c
> +++ b/net/ipv6/af_inet6.c
> @@ -980,6 +980,7 @@ static int __net_init inet6_net_init(struct net *net)
> net->ipv6.sysctl.max_dst_opts_len = IP6_DEFAULT_MAX_DST_OPTS_LEN;
> net->ipv6.sysctl.max_hbh_opts_len = IP6_DEFAULT_MAX_HBH_OPTS_LEN;
> net->ipv6.sysctl.fib_notify_on_flag_change = 0;
> + net->ipv6.sysctl.enforce_ext_hdr_order = 1;
> atomic_set(&net->ipv6.fib6_sernum, 1);
>
> net->ipv6.sysctl.ioam6_id = IOAM6_DEFAULT_ID;
> diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
> index 6925cfad94d2..4ab94c8cddb9 100644
> --- a/net/ipv6/exthdrs.c
> +++ b/net/ipv6/exthdrs.c
> @@ -845,11 +845,13 @@ static int ipv6_rthdr_rcv(struct sk_buff *skb)
> static const struct inet6_protocol rthdr_protocol = {
> .handler = ipv6_rthdr_rcv,
> .flags = INET6_PROTO_NOPOLICY,
> + .ext_hdr_order = IPV6_EXT_HDR_ORDER_ROUTING,
> };
>
> static const struct inet6_protocol destopt_protocol = {
> .handler = ipv6_destopt_rcv,
> .flags = INET6_PROTO_NOPOLICY,
> + .ext_hdr_order = IPV6_EXT_HDR_ORDER_DEST,
> };
>
> static const struct inet6_protocol nodata_protocol = {
> diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
> index 168ec07e31cc..ab921c0a94af 100644
> --- a/net/ipv6/ip6_input.c
> +++ b/net/ipv6/ip6_input.c
> @@ -355,6 +355,27 @@ void ipv6_list_rcv(struct list_head *head, struct packet_type *pt,
> ip6_sublist_rcv(&sublist, curr_dev, curr_net);
> }
>
> +static u32 check_dst_opts_before_rh(const struct inet6_protocol *ipprot,
> + u32 ext_hdrs)
> +{
> + /* Check if Destination Options before the Routing Header are
> + * present.
> + */
> + if (ipprot->ext_hdr_order != IPV6_EXT_HDR_ORDER_ROUTING ||
> + !(ext_hdrs | IPV6_EXT_HDR_ORDER_DEST))
Just curious, did you test this? The condition looks wrong to me. By
the way, how about adding a selftest (python+scapy) that covers all EH
combinations? It should be straightforward and could come later, not
necessarily in this series.
I think you inverted IPV6_EXT_HDR_ORDER_ROUTING and
IPV6_EXT_HDR_ORDER_DEST. Also, (ext_hdrs | IPV6_EXT_HDR_ORDER_DEST) is
non-zero for any value of ext_hdrs, so the negated test can never be
true; it should probably be (ext_hdrs & IPV6_EXT_HDR_ORDER_DEST)
instead. See my proposal below.
> + return ext_hdrs;
> +
> + /* We have Destination Options before the Routing Header. Set
> + * the mask of recived extension headers to reflect that. We promote
Reported by AI:
s/recived/received
(in case you keep this part)
> + * the bit from indicating just Destination Options present to
> + * Destination Options before the Routing Header being present
> + */
> + ext_hdrs = (ext_hdrs & ~IPV6_EXT_HDR_ORDER_DEST) |
> + IPV6_EXT_HDR_ORDER_DEST_BEFORE_RH;
> +
> + return ext_hdrs;
> +}
> +
> INDIRECT_CALLABLE_DECLARE(int tcp_v6_rcv(struct sk_buff *));
>
> /*
> @@ -366,6 +387,7 @@ void ip6_protocol_deliver_rcu(struct net *net, struct sk_buff *skb, int nexthdr,
> const struct inet6_protocol *ipprot;
> struct inet6_dev *idev;
> unsigned int nhoff;
> + u32 ext_hdrs = 0;
> SKB_DR(reason);
> bool raw;
>
> @@ -427,6 +449,26 @@ void ip6_protocol_deliver_rcu(struct net *net, struct sk_buff *skb, int nexthdr,
> goto discard;
> }
> }
> +
> + if (ipprot->ext_hdr_order &&
> + READ_ONCE(net->ipv6.sysctl.enforce_ext_hdr_order)) {
> + /* The protocol is an extension header and EH ordering
> + * is being enforced. Discard packet if we've already
> + * seen this EH or one that is lower in the order list
> + */
> + if (ipprot->ext_hdr_order <= ext_hdrs) {
> + /* Check if there's Destination Options
> + * before the Routing Header
> + */
> + ext_hdrs = check_dst_opts_before_rh(ipprot,
> + ext_hdrs);
> + if (ipprot->ext_hdr_order <= ext_hdrs)
> + goto discard;
> + }
> +
> + ext_hdrs |= ipprot->ext_hdr_order;
> + }
> +
I didn't test your version, and I haven't tested this attempt either; feedback welcome:
#define IPV6_EXT_HDR_ORDER_HOP BIT(0)
#define IPV6_EXT_HDR_ORDER_DEST BIT(1)
#define IPV6_EXT_HDR_ORDER_ROUTING BIT(2)
#define IPV6_EXT_HDR_ORDER_FRAGMENT BIT(3)
#define IPV6_EXT_HDR_ORDER_AUTH BIT(4)
#define IPV6_EXT_HDR_ORDER_ESP BIT(5)
#define IPV6_EXT_HDR_ORDER_DEST2 BIT(6)
[...]
static const struct inet6_protocol destopt_protocol = {
	.handler = ipv6_destopt_rcv,
	.flags = INET6_PROTO_NOPOLICY,
	.ext_hdr_order = IPV6_EXT_HDR_ORDER_DEST,
};
[...]
u32 ext_hdrs = 0;
if (ipprot->ext_hdr_order) {
	if (ipprot->ext_hdr_order <= ext_hdrs) {
		if ((ext_hdrs & IPV6_EXT_HDR_ORDER_DEST2) ||
		    ipprot->ext_hdr_order != IPV6_EXT_HDR_ORDER_DEST ||
		    !(ext_hdrs & IPV6_EXT_HDR_ORDER_ROUTING)) {
			goto discard;
		}
		if (ipprot->ext_hdr_order == IPV6_EXT_HDR_ORDER_DEST)
			ext_hdrs |= IPV6_EXT_HDR_ORDER_DEST2;
	}
	ext_hdrs |= ipprot->ext_hdr_order;
}
... which looks quite similar to yours without the error mentioned above.
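Since neither version has been tested, one cheap way to get feedback before a real selftest exists is to lift the proposed check into a userspace model and feed it header chains. The sketch below does that for the proposal above only. It is an assumption-laden model, not kernel code: eh_chain_accepted() and the shortened ORDER_* names are made up here, the sysctl check is left out, and a chain is just an array of per-header order bits (0 for anything that is not an extension header).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BIT(n)		(1u << (n))

/* Bit assignment from the proposal above, names shortened */
#define ORDER_HOP	BIT(0)
#define ORDER_DEST	BIT(1)
#define ORDER_ROUTING	BIT(2)
#define ORDER_FRAGMENT	BIT(3)
#define ORDER_AUTH	BIT(4)
#define ORDER_ESP	BIT(5)
#define ORDER_DEST2	BIT(6)

/* Hypothetical model: returns true if the chain would be accepted,
 * false where the proposal would "goto discard".
 */
static bool eh_chain_accepted(const uint32_t *chain, size_t n)
{
	uint32_t ext_hdrs = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		uint32_t order = chain[i];

		if (!order)	/* not an extension header */
			continue;

		if (order <= ext_hdrs) {
			/* Only a second Destination Options header after
			 * a Routing header is tolerated, and only once.
			 */
			if ((ext_hdrs & ORDER_DEST2) ||
			    order != ORDER_DEST ||
			    !(ext_hdrs & ORDER_ROUTING))
				return false;
			if (order == ORDER_DEST)
				ext_hdrs |= ORDER_DEST2;
		}
		ext_hdrs |= order;
	}
	return true;
}
```

In this model the full RFC 8200 order is accepted, and a repeated Routing header or a Routing header after a Fragment header is discarded. One thing the model also shows is that a Destination Options header following a Fragment header is discarded when no Routing header was seen earlier; whether that is intended may be worth checking.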
> if (!(ipprot->flags & INET6_PROTO_NOPOLICY)) {
> if (!xfrm6_policy_check(NULL, XFRM_POLICY_IN, skb)) {
> SKB_DR_SET(reason, XFRM_POLICY);
> diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c
> index 25ec8001898d..91dba72c5a3c 100644
> --- a/net/ipv6/reassembly.c
> +++ b/net/ipv6/reassembly.c
> @@ -414,6 +414,7 @@ static int ipv6_frag_rcv(struct sk_buff *skb)
> static const struct inet6_protocol frag_protocol = {
> .handler = ipv6_frag_rcv,
> .flags = INET6_PROTO_NOPOLICY,
> + .ext_hdr_order = IPV6_EXT_HDR_ORDER_FRAGMENT,
> };
>
> #ifdef CONFIG_SYSCTL
> diff --git a/net/ipv6/sysctl_net_ipv6.c b/net/ipv6/sysctl_net_ipv6.c
> index d2cd33e2698d..543b6acdb11d 100644
> --- a/net/ipv6/sysctl_net_ipv6.c
> +++ b/net/ipv6/sysctl_net_ipv6.c
> @@ -213,6 +213,13 @@ static struct ctl_table ipv6_table_template[] = {
> .proc_handler = proc_doulongvec_minmax,
> .extra2 = &ioam6_id_wide_max,
> },
> + {
> + .procname = "enforce_ext_hdr_order",
> + .data = &init_net.ipv6.sysctl.enforce_ext_hdr_order,
> + .maxlen = sizeof(u8),
> + .mode = 0644,
> + .proc_handler = proc_dou8vec_minmax,
> + },
> };
>
> static struct ctl_table ipv6_rotable[] = {
> diff --git a/net/ipv6/xfrm6_protocol.c b/net/ipv6/xfrm6_protocol.c
> index ea2f805d3b01..5826edf67f64 100644
> --- a/net/ipv6/xfrm6_protocol.c
> +++ b/net/ipv6/xfrm6_protocol.c
> @@ -197,12 +197,14 @@ static const struct inet6_protocol esp6_protocol = {
> .handler = xfrm6_esp_rcv,
> .err_handler = xfrm6_esp_err,
> .flags = INET6_PROTO_NOPOLICY,
> + .ext_hdr_order = IPV6_EXT_HDR_ORDER_ESP,
> };
>
> static const struct inet6_protocol ah6_protocol = {
> .handler = xfrm6_ah_rcv,
> .err_handler = xfrm6_ah_err,
> .flags = INET6_PROTO_NOPOLICY,
> + .ext_hdr_order = IPV6_EXT_HDR_ORDER_AUTH
> };
>
> static const struct inet6_protocol ipcomp6_protocol = {
* Re: [PATCH net-next v5 6/7] ipv6: Enforce Extension Header ordering
2026-01-26 19:48 ` [PATCH net-next v5 6/7] ipv6: Enforce Extension Header ordering Tom Herbert
2026-01-27 19:48 ` Justin Iurman
@ 2026-01-29 5:18 ` Willem de Bruijn
2026-01-29 18:07 ` Justin Iurman
1 sibling, 1 reply; 30+ messages in thread
From: Willem de Bruijn @ 2026-01-29 5:18 UTC (permalink / raw)
To: Tom Herbert, davem, kuba, netdev, justin.iurman; +Cc: Tom Herbert
Tom Herbert wrote:
> RFC8200 highly recommends that different Extension Headers be send in
> a prescibed order and all Extension Header types occur at most once
> in a packet with the exception of Destination Options that may
> occur twice. This patch enforces the ordering be folowed in received
> packets.
>
> The allowed order of Extension Headers is:
>
> IPv6 header
> Hop-by-Hop Options header
> Destination Options before the Routing Header
> Routing header
> Fragment header
> Authentication header
> Encapsulating Security Payload header
> Destination Options header
> Upper-Layer header
>
> Each Extension Header may be present only once in a packet.
>
> net.ipv6.enforce_ext_hdr_order is a sysctl to enable or disable
> enforcement of xtension Header order. If it is set to zero then
Typo: s/xtension/Extension/. There are a few more typos in the various
commit messages.
> Extension Header order and number of occurences is not checked
> in receive processeing (except for Hop-by-Hop Options that
> must be the first Extension Header and can only occur once in
> a packet.
RFC 8200 also states
"IPv6 nodes must accept and attempt to process extension headers in
any order and occurring any number of times in the same packet,
except for the Hop-by-Hop Options header, which is restricted to
appear immediately after an IPv6 header only. Nonetheless, it is
strongly advised that sources of IPv6 packets adhere to the above
recommended order until and unless subsequent specifications revise
that recommendation."
This is a case of "be strict in what you send, liberal in what you
accept". Since the new sysctl defaults to enabled, it has a chance of
breaking existing users.
The series as a whole is framed as a security improvement. Does
enforcing order help with that?
> Signed-off-by: Tom Herbert <tom@herbertland.com>
> ---
> include/net/netns/ipv6.h | 1 +
> include/net/protocol.h | 14 +++++++++++++
> net/ipv6/af_inet6.c | 1 +
> net/ipv6/exthdrs.c | 2 ++
> net/ipv6/ip6_input.c | 42 ++++++++++++++++++++++++++++++++++++++
> net/ipv6/reassembly.c | 1 +
> net/ipv6/sysctl_net_ipv6.c | 7 +++++++
> net/ipv6/xfrm6_protocol.c | 2 ++
> 8 files changed, 70 insertions(+)
>
> diff --git a/include/net/netns/ipv6.h b/include/net/netns/ipv6.h
> index 34bdb1308e8f..2db56718ea60 100644
> --- a/include/net/netns/ipv6.h
> +++ b/include/net/netns/ipv6.h
> @@ -61,6 +61,7 @@ struct netns_sysctl_ipv6 {
> u8 fib_notify_on_flag_change;
> u8 icmpv6_error_anycast_as_unicast;
> u8 icmpv6_errors_extension_mask;
> + u8 enforce_ext_hdr_order;
> };
>
> struct netns_ipv6 {
> diff --git a/include/net/protocol.h b/include/net/protocol.h
> index b2499f88f8f8..0f1676625570 100644
> --- a/include/net/protocol.h
> +++ b/include/net/protocol.h
> @@ -50,6 +50,19 @@ struct net_protocol {
> };
>
> #if IS_ENABLED(CONFIG_IPV6)
> +
> +/* Order of extension headers as prescribed in RFC8200. The ordering and
> + * number of extension headers in a packet can be enforced in IPv6 receive
> + * processing.
> + */
> +#define IPV6_EXT_HDR_ORDER_HOP BIT(0)
> +#define IPV6_EXT_HDR_ORDER_DEST_BEFORE_RH BIT(1)
> +#define IPV6_EXT_HDR_ORDER_ROUTING BIT(2)
> +#define IPV6_EXT_HDR_ORDER_FRAGMENT BIT(3)
> +#define IPV6_EXT_HDR_ORDER_AUTH BIT(4)
> +#define IPV6_EXT_HDR_ORDER_ESP BIT(5)
> +#define IPV6_EXT_HDR_ORDER_DEST BIT(6)
> +
> struct inet6_protocol {
> int (*handler)(struct sk_buff *skb);
>
> @@ -61,6 +74,7 @@ struct inet6_protocol {
>
> unsigned int flags; /* INET6_PROTO_xxx */
> u32 secret;
> + u32 ext_hdr_order;
> };
>
> #define INET6_PROTO_NOPOLICY 0x1
> diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
> index bd29840659f3..43097360ce64 100644
> --- a/net/ipv6/af_inet6.c
> +++ b/net/ipv6/af_inet6.c
> @@ -980,6 +980,7 @@ static int __net_init inet6_net_init(struct net *net)
> net->ipv6.sysctl.max_dst_opts_len = IP6_DEFAULT_MAX_DST_OPTS_LEN;
> net->ipv6.sysctl.max_hbh_opts_len = IP6_DEFAULT_MAX_HBH_OPTS_LEN;
> net->ipv6.sysctl.fib_notify_on_flag_change = 0;
> + net->ipv6.sysctl.enforce_ext_hdr_order = 1;
> atomic_set(&net->ipv6.fib6_sernum, 1);
>
> net->ipv6.sysctl.ioam6_id = IOAM6_DEFAULT_ID;
> diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
> index 6925cfad94d2..4ab94c8cddb9 100644
> --- a/net/ipv6/exthdrs.c
> +++ b/net/ipv6/exthdrs.c
> @@ -845,11 +845,13 @@ static int ipv6_rthdr_rcv(struct sk_buff *skb)
> static const struct inet6_protocol rthdr_protocol = {
> .handler = ipv6_rthdr_rcv,
> .flags = INET6_PROTO_NOPOLICY,
> + .ext_hdr_order = IPV6_EXT_HDR_ORDER_ROUTING,
> };
>
> static const struct inet6_protocol destopt_protocol = {
> .handler = ipv6_destopt_rcv,
> .flags = INET6_PROTO_NOPOLICY,
> + .ext_hdr_order = IPV6_EXT_HDR_ORDER_DEST,
> };
>
> static const struct inet6_protocol nodata_protocol = {
> diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
> index 168ec07e31cc..ab921c0a94af 100644
> --- a/net/ipv6/ip6_input.c
> +++ b/net/ipv6/ip6_input.c
> @@ -355,6 +355,27 @@ void ipv6_list_rcv(struct list_head *head, struct packet_type *pt,
> ip6_sublist_rcv(&sublist, curr_dev, curr_net);
> }
>
> +static u32 check_dst_opts_before_rh(const struct inet6_protocol *ipprot,
> + u32 ext_hdrs)
> +{
> + /* Check if Destination Options before the Routing Header are
> + * present.
> + */
> + if (ipprot->ext_hdr_order != IPV6_EXT_HDR_ORDER_ROUTING ||
> + !(ext_hdrs | IPV6_EXT_HDR_ORDER_DEST))
> + return ext_hdrs;
> +
> + /* We have Destination Options before the Routing Header. Set
> + * the mask of recived extension headers to reflect that. We promote
> + * the bit from indicating just Destination Options present to
> + * Destination Options before the Routing Header being present
> + */
> + ext_hdrs = (ext_hdrs & ~IPV6_EXT_HDR_ORDER_DEST) |
> + IPV6_EXT_HDR_ORDER_DEST_BEFORE_RH;
> +
> + return ext_hdrs;
> +}
> +
> INDIRECT_CALLABLE_DECLARE(int tcp_v6_rcv(struct sk_buff *));
>
> /*
> @@ -366,6 +387,7 @@ void ip6_protocol_deliver_rcu(struct net *net, struct sk_buff *skb, int nexthdr,
> const struct inet6_protocol *ipprot;
> struct inet6_dev *idev;
> unsigned int nhoff;
> + u32 ext_hdrs = 0;
> SKB_DR(reason);
> bool raw;
>
> @@ -427,6 +449,26 @@ void ip6_protocol_deliver_rcu(struct net *net, struct sk_buff *skb, int nexthdr,
> goto discard;
> }
> }
> +
> + if (ipprot->ext_hdr_order &&
> + READ_ONCE(net->ipv6.sysctl.enforce_ext_hdr_order)) {
> + /* The protocol is an extension header and EH ordering
> + * is being enforced. Discard packet if we've already
> + * seen this EH or one that is lower in the order list
> + */
> + if (ipprot->ext_hdr_order <= ext_hdrs) {
> + /* Check if there's Destination Options
> + * before the Routing Header
> + */
> + ext_hdrs = check_dst_opts_before_rh(ipprot,
> + ext_hdrs);
> + if (ipprot->ext_hdr_order <= ext_hdrs)
> + goto discard;
> + }
> +
> + ext_hdrs |= ipprot->ext_hdr_order;
> + }
> +
> if (!(ipprot->flags & INET6_PROTO_NOPOLICY)) {
> if (!xfrm6_policy_check(NULL, XFRM_POLICY_IN, skb)) {
> SKB_DR_SET(reason, XFRM_POLICY);
> diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c
> index 25ec8001898d..91dba72c5a3c 100644
> --- a/net/ipv6/reassembly.c
> +++ b/net/ipv6/reassembly.c
> @@ -414,6 +414,7 @@ static int ipv6_frag_rcv(struct sk_buff *skb)
> static const struct inet6_protocol frag_protocol = {
> .handler = ipv6_frag_rcv,
> .flags = INET6_PROTO_NOPOLICY,
> + .ext_hdr_order = IPV6_EXT_HDR_ORDER_FRAGMENT,
> };
>
> #ifdef CONFIG_SYSCTL
> diff --git a/net/ipv6/sysctl_net_ipv6.c b/net/ipv6/sysctl_net_ipv6.c
> index d2cd33e2698d..543b6acdb11d 100644
> --- a/net/ipv6/sysctl_net_ipv6.c
> +++ b/net/ipv6/sysctl_net_ipv6.c
> @@ -213,6 +213,13 @@ static struct ctl_table ipv6_table_template[] = {
> .proc_handler = proc_doulongvec_minmax,
> .extra2 = &ioam6_id_wide_max,
> },
> + {
> + .procname = "enforce_ext_hdr_order",
> + .data = &init_net.ipv6.sysctl.enforce_ext_hdr_order,
> + .maxlen = sizeof(u8),
> + .mode = 0644,
> + .proc_handler = proc_dou8vec_minmax,
> + },
> };
>
> static struct ctl_table ipv6_rotable[] = {
> diff --git a/net/ipv6/xfrm6_protocol.c b/net/ipv6/xfrm6_protocol.c
> index ea2f805d3b01..5826edf67f64 100644
> --- a/net/ipv6/xfrm6_protocol.c
> +++ b/net/ipv6/xfrm6_protocol.c
> @@ -197,12 +197,14 @@ static const struct inet6_protocol esp6_protocol = {
> .handler = xfrm6_esp_rcv,
> .err_handler = xfrm6_esp_err,
> .flags = INET6_PROTO_NOPOLICY,
> + .ext_hdr_order = IPV6_EXT_HDR_ORDER_ESP,
> };
>
> static const struct inet6_protocol ah6_protocol = {
> .handler = xfrm6_ah_rcv,
> .err_handler = xfrm6_ah_err,
> .flags = INET6_PROTO_NOPOLICY,
> + .ext_hdr_order = IPV6_EXT_HDR_ORDER_AUTH
> };
>
> static const struct inet6_protocol ipcomp6_protocol = {
> --
> 2.43.0
>
* Re: [PATCH net-next v5 2/7] ipv6: Cleanup IPv6 TLV definitions
2026-01-26 19:48 ` [PATCH net-next v5 2/7] ipv6: Cleanup IPv6 TLV definitions Tom Herbert
2026-01-27 17:51 ` Justin Iurman
@ 2026-01-29 5:30 ` Willem de Bruijn
2026-01-29 18:13 ` Justin Iurman
1 sibling, 1 reply; 30+ messages in thread
From: Willem de Bruijn @ 2026-01-29 5:30 UTC (permalink / raw)
To: Tom Herbert, davem, kuba, netdev, justin.iurman; +Cc: Tom Herbert
Tom Herbert wrote:
> Move IPV6_TLV_TNL_ENCAP_LIMIT to uapi/linux/in6.h to be with the rest
> of the TLV definitions. Label each of the TLV definitions as to whether
> they are a Hop-by-Hop option, Destination option, or both.
>
> Signed-off-by: Tom Herbert <tom@herbertland.com>
> ---
> include/uapi/linux/in6.h | 21 ++++++++++++++-------
> include/uapi/linux/ip6_tunnel.h | 1 -
> 2 files changed, 14 insertions(+), 8 deletions(-)
>
> diff --git a/include/uapi/linux/in6.h b/include/uapi/linux/in6.h
> index 5a47339ef7d7..438283dc5fde 100644
> --- a/include/uapi/linux/in6.h
> +++ b/include/uapi/linux/in6.h
> @@ -140,14 +140,21 @@ struct in6_flowlabel_req {
>
> /*
> * IPv6 TLV options.
> + *
> + * Hop-by-Hop and Destination options share the same number space.
> + * For each option below whether it is a Hop-by-Hop option or
> + * a Destination option is indicated by HBH or DestOpt.
> */
> -#define IPV6_TLV_PAD1 0
> -#define IPV6_TLV_PADN 1
> -#define IPV6_TLV_ROUTERALERT 5
> -#define IPV6_TLV_CALIPSO 7 /* RFC 5570 */
> -#define IPV6_TLV_IOAM 49 /* RFC 9486 */
> -#define IPV6_TLV_JUMBO 194
> -#define IPV6_TLV_HAO 201 /* home address option */
> +#define IPV6_TLV_PAD1 0 /* HBH or DestOpt */
> +#define IPV6_TLV_PADN 1 /* HBH or DestOpt */
> +#define IPV6_TLV_TNL_ENCAP_LIMIT 4 /* RFC 2473, DestOpt */
> +#define IPV6_TLV_ROUTERALERT 5 /* HBH */
> +#define IPV6_TLV_CALIPSO 7 /* RFC 5570, HBH */
> +#define IPV6_TLV_IOAM 49 /* RFC 9486, HBH or Destopt
> + * IOAM sent and rcvd as HBH
Explicit labeling with HBH or Destopt is quite informative.
Does this mean that IPV6_TLV_IOAM should also be accepted in ip6_parse_tlv
in the Destopt branch? RFC 9486 indeed did reserve a number.
> + */
> +#define IPV6_TLV_JUMBO 194 /* HBH */
> +#define IPV6_TLV_HAO 201 /* home address option, DestOpt */
>
> /*
> * IPV6 socket options
> diff --git a/include/uapi/linux/ip6_tunnel.h b/include/uapi/linux/ip6_tunnel.h
> index 85182a839d42..35af4d9c35fb 100644
> --- a/include/uapi/linux/ip6_tunnel.h
> +++ b/include/uapi/linux/ip6_tunnel.h
> @@ -6,7 +6,6 @@
> #include <linux/if.h> /* For IFNAMSIZ. */
> #include <linux/in6.h> /* For struct in6_addr. */
>
> -#define IPV6_TLV_TNL_ENCAP_LIMIT 4
> #define IPV6_DEFAULT_TNL_ENCAP_LIMIT 4
>
> /* don't add encapsulation limit if one isn't present in inner packet */
> --
> 2.43.0
>
* Re: [PATCH net-next v5 3/7] ipv6: Add case for IPV6_TLV_TNL_ENCAP_LIMIT in EH TLV switch
2026-01-26 19:48 ` [PATCH net-next v5 3/7] ipv6: Add case for IPV6_TLV_TNL_ENCAP_LIMIT in EH TLV switch Tom Herbert
2026-01-27 17:52 ` Justin Iurman
@ 2026-01-29 5:31 ` Willem de Bruijn
1 sibling, 0 replies; 30+ messages in thread
From: Willem de Bruijn @ 2026-01-29 5:31 UTC (permalink / raw)
To: Tom Herbert, davem, kuba, netdev, justin.iurman; +Cc: Tom Herbert
Tom Herbert wrote:
> IPV6_TLV_TNL_ENCAP_LIMIT is a recognized Destination option that is
> processed in ip_tunnel.c. Add a case for it in the switch in
> ip6_parse_tlv so that it is recognized as a known option.
>
> Also remove the unlikely around the check for max_count < 0 since the
> default limits for HBH and Destination options can be less than zero.
>
> Signed-off-by: Tom Herbert <tom@herbertland.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
* Re: [PATCH net-next v5 6/7] ipv6: Enforce Extension Header ordering
2026-01-29 5:18 ` Willem de Bruijn
@ 2026-01-29 18:07 ` Justin Iurman
2026-01-29 19:05 ` Willem de Bruijn
0 siblings, 1 reply; 30+ messages in thread
From: Justin Iurman @ 2026-01-29 18:07 UTC (permalink / raw)
To: Willem de Bruijn, Tom Herbert, davem, kuba, netdev
On 1/29/26 06:18, Willem de Bruijn wrote:
> Tom Herbert wrote:
>> RFC8200 highly recommends that different Extension Headers be send in
>> a prescibed order and all Extension Header types occur at most once
>> in a packet with the exception of Destination Options that may
>> occur twice. This patch enforces the ordering be folowed in received
>> packets.
>>
>> The allowed order of Extension Headers is:
>>
>> IPv6 header
>> Hop-by-Hop Options header
>> Destination Options before the Routing Header
>> Routing header
>> Fragment header
>> Authentication header
>> Encapsulating Security Payload header
>> Destination Options header
>> Upper-Layer header
>>
>> Each Extension Header may be present only once in a packet.
>>
>> net.ipv6.enforce_ext_hdr_order is a sysctl to enable or disable
>> enforcement of xtension Header order. If it is set to zero then
>
> [e]xtension. There are a few more typos in the various commit
> messages.
>
>> Extension Header order and number of occurences is not checked
>> in receive processeing (except for Hop-by-Hop Options that
>> must be the first Extension Header and can only occur once in
>> a packet.
>
> RFC 8200 also states
>
> "IPv6 nodes must accept and attempt to process extension headers in
> any order and occurring any number of times in the same packet,
> except for the Hop-by-Hop Options header, which is restricted to
> appear immediately after an IPv6 header only. Nonetheless, it is
> strongly advised that sources of IPv6 packets adhere to the above
> recommended order until and unless subsequent specifications revise
> that recommendation."
>
> A case of be strict in what you send, liberal in what you accept.
>
> This new sysctl has a chance of breaking existing users.
Willem,
Note that RFC8200 does not use normative language, which is part of the
problem. It could theoretically break existing users, but I don't think
it will in reality. For that, you would need users to begin with
(joke aside, I like EHs). Anyway, if the order is enforced at sending,
why would any receiver accept a different order? In this case, being
liberal in what we accept might be a security risk (see below).
> The series as a whole is framed as a security improvement. Does
> enforcing order help with that?
IMHO, any packet with EHs in a different order than the one specified in
RFC8200 looks suspicious. So, yes.
Justin
>> Signed-off-by: Tom Herbert <tom@herbertland.com>
>> ---
>> include/net/netns/ipv6.h | 1 +
>> include/net/protocol.h | 14 +++++++++++++
>> net/ipv6/af_inet6.c | 1 +
>> net/ipv6/exthdrs.c | 2 ++
>> net/ipv6/ip6_input.c | 42 ++++++++++++++++++++++++++++++++++++++
>> net/ipv6/reassembly.c | 1 +
>> net/ipv6/sysctl_net_ipv6.c | 7 +++++++
>> net/ipv6/xfrm6_protocol.c | 2 ++
>> 8 files changed, 70 insertions(+)
>>
>> diff --git a/include/net/netns/ipv6.h b/include/net/netns/ipv6.h
>> index 34bdb1308e8f..2db56718ea60 100644
>> --- a/include/net/netns/ipv6.h
>> +++ b/include/net/netns/ipv6.h
>> @@ -61,6 +61,7 @@ struct netns_sysctl_ipv6 {
>> u8 fib_notify_on_flag_change;
>> u8 icmpv6_error_anycast_as_unicast;
>> u8 icmpv6_errors_extension_mask;
>> + u8 enforce_ext_hdr_order;
>> };
>>
>> struct netns_ipv6 {
>> diff --git a/include/net/protocol.h b/include/net/protocol.h
>> index b2499f88f8f8..0f1676625570 100644
>> --- a/include/net/protocol.h
>> +++ b/include/net/protocol.h
>> @@ -50,6 +50,19 @@ struct net_protocol {
>> };
>>
>> #if IS_ENABLED(CONFIG_IPV6)
>> +
>> +/* Order of extension headers as prescribed in RFC8200. The ordering and
>> + * number of extension headers in a packet can be enforced in IPv6 receive
>> + * processing.
>> + */
>> +#define IPV6_EXT_HDR_ORDER_HOP BIT(0)
>> +#define IPV6_EXT_HDR_ORDER_DEST_BEFORE_RH BIT(1)
>> +#define IPV6_EXT_HDR_ORDER_ROUTING BIT(2)
>> +#define IPV6_EXT_HDR_ORDER_FRAGMENT BIT(3)
>> +#define IPV6_EXT_HDR_ORDER_AUTH BIT(4)
>> +#define IPV6_EXT_HDR_ORDER_ESP BIT(5)
>> +#define IPV6_EXT_HDR_ORDER_DEST BIT(6)
>> +
>> struct inet6_protocol {
>> int (*handler)(struct sk_buff *skb);
>>
>> @@ -61,6 +74,7 @@ struct inet6_protocol {
>>
>> unsigned int flags; /* INET6_PROTO_xxx */
>> u32 secret;
>> + u32 ext_hdr_order;
>> };
>>
>> #define INET6_PROTO_NOPOLICY 0x1
>> diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
>> index bd29840659f3..43097360ce64 100644
>> --- a/net/ipv6/af_inet6.c
>> +++ b/net/ipv6/af_inet6.c
>> @@ -980,6 +980,7 @@ static int __net_init inet6_net_init(struct net *net)
>> net->ipv6.sysctl.max_dst_opts_len = IP6_DEFAULT_MAX_DST_OPTS_LEN;
>> net->ipv6.sysctl.max_hbh_opts_len = IP6_DEFAULT_MAX_HBH_OPTS_LEN;
>> net->ipv6.sysctl.fib_notify_on_flag_change = 0;
>> + net->ipv6.sysctl.enforce_ext_hdr_order = 1;
>> atomic_set(&net->ipv6.fib6_sernum, 1);
>>
>> net->ipv6.sysctl.ioam6_id = IOAM6_DEFAULT_ID;
>> diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
>> index 6925cfad94d2..4ab94c8cddb9 100644
>> --- a/net/ipv6/exthdrs.c
>> +++ b/net/ipv6/exthdrs.c
>> @@ -845,11 +845,13 @@ static int ipv6_rthdr_rcv(struct sk_buff *skb)
>> static const struct inet6_protocol rthdr_protocol = {
>> .handler = ipv6_rthdr_rcv,
>> .flags = INET6_PROTO_NOPOLICY,
>> + .ext_hdr_order = IPV6_EXT_HDR_ORDER_ROUTING,
>> };
>>
>> static const struct inet6_protocol destopt_protocol = {
>> .handler = ipv6_destopt_rcv,
>> .flags = INET6_PROTO_NOPOLICY,
>> + .ext_hdr_order = IPV6_EXT_HDR_ORDER_DEST,
>> };
>>
>> static const struct inet6_protocol nodata_protocol = {
>> diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
>> index 168ec07e31cc..ab921c0a94af 100644
>> --- a/net/ipv6/ip6_input.c
>> +++ b/net/ipv6/ip6_input.c
>> @@ -355,6 +355,27 @@ void ipv6_list_rcv(struct list_head *head, struct packet_type *pt,
>> ip6_sublist_rcv(&sublist, curr_dev, curr_net);
>> }
>>
>> +static u32 check_dst_opts_before_rh(const struct inet6_protocol *ipprot,
>> + u32 ext_hdrs)
>> +{
>> + /* Check if Destination Options before the Routing Header are
>> + * present.
>> + */
>> + if (ipprot->ext_hdr_order != IPV6_EXT_HDR_ORDER_ROUTING ||
>> + !(ext_hdrs | IPV6_EXT_HDR_ORDER_DEST))
>> + return ext_hdrs;
>> +
>> + /* We have Destination Options before the Routing Header. Set
>> + * the mask of recived extension headers to reflect that. We promote
>> + * the bit from indicating just Destination Options present to
>> + * Destination Options before the Routing Header being present
>> + */
>> + ext_hdrs = (ext_hdrs & ~IPV6_EXT_HDR_ORDER_DEST) |
>> + IPV6_EXT_HDR_ORDER_DEST_BEFORE_RH;
>> +
>> + return ext_hdrs;
>> +}
>> +
>> INDIRECT_CALLABLE_DECLARE(int tcp_v6_rcv(struct sk_buff *));
>>
>> /*
>> @@ -366,6 +387,7 @@ void ip6_protocol_deliver_rcu(struct net *net, struct sk_buff *skb, int nexthdr,
>> const struct inet6_protocol *ipprot;
>> struct inet6_dev *idev;
>> unsigned int nhoff;
>> + u32 ext_hdrs = 0;
>> SKB_DR(reason);
>> bool raw;
>>
>> @@ -427,6 +449,26 @@ void ip6_protocol_deliver_rcu(struct net *net, struct sk_buff *skb, int nexthdr,
>> goto discard;
>> }
>> }
>> +
>> + if (ipprot->ext_hdr_order &&
>> + READ_ONCE(net->ipv6.sysctl.enforce_ext_hdr_order)) {
>> + /* The protocol is an extension header and EH ordering
>> + * is being enforced. Discard packet if we've already
>> + * seen this EH or one that is lower in the order list
>> + */
>> + if (ipprot->ext_hdr_order <= ext_hdrs) {
>> + /* Check if there's Destination Options
>> + * before the Routing Header
>> + */
>> + ext_hdrs = check_dst_opts_before_rh(ipprot,
>> + ext_hdrs);
>> + if (ipprot->ext_hdr_order <= ext_hdrs)
>> + goto discard;
>> + }
>> +
>> + ext_hdrs |= ipprot->ext_hdr_order;
>> + }
>> +
>> if (!(ipprot->flags & INET6_PROTO_NOPOLICY)) {
>> if (!xfrm6_policy_check(NULL, XFRM_POLICY_IN, skb)) {
>> SKB_DR_SET(reason, XFRM_POLICY);
>> diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c
>> index 25ec8001898d..91dba72c5a3c 100644
>> --- a/net/ipv6/reassembly.c
>> +++ b/net/ipv6/reassembly.c
>> @@ -414,6 +414,7 @@ static int ipv6_frag_rcv(struct sk_buff *skb)
>> static const struct inet6_protocol frag_protocol = {
>> .handler = ipv6_frag_rcv,
>> .flags = INET6_PROTO_NOPOLICY,
>> + .ext_hdr_order = IPV6_EXT_HDR_ORDER_FRAGMENT,
>> };
>>
>> #ifdef CONFIG_SYSCTL
>> diff --git a/net/ipv6/sysctl_net_ipv6.c b/net/ipv6/sysctl_net_ipv6.c
>> index d2cd33e2698d..543b6acdb11d 100644
>> --- a/net/ipv6/sysctl_net_ipv6.c
>> +++ b/net/ipv6/sysctl_net_ipv6.c
>> @@ -213,6 +213,13 @@ static struct ctl_table ipv6_table_template[] = {
>> .proc_handler = proc_doulongvec_minmax,
>> .extra2 = &ioam6_id_wide_max,
>> },
>> + {
>> + .procname = "enforce_ext_hdr_order",
>> + .data = &init_net.ipv6.sysctl.enforce_ext_hdr_order,
>> + .maxlen = sizeof(u8),
>> + .mode = 0644,
>> + .proc_handler = proc_dou8vec_minmax,
>> + },
>> };
>>
>> static struct ctl_table ipv6_rotable[] = {
>> diff --git a/net/ipv6/xfrm6_protocol.c b/net/ipv6/xfrm6_protocol.c
>> index ea2f805d3b01..5826edf67f64 100644
>> --- a/net/ipv6/xfrm6_protocol.c
>> +++ b/net/ipv6/xfrm6_protocol.c
>> @@ -197,12 +197,14 @@ static const struct inet6_protocol esp6_protocol = {
>> .handler = xfrm6_esp_rcv,
>> .err_handler = xfrm6_esp_err,
>> .flags = INET6_PROTO_NOPOLICY,
>> + .ext_hdr_order = IPV6_EXT_HDR_ORDER_ESP,
>> };
>>
>> static const struct inet6_protocol ah6_protocol = {
>> .handler = xfrm6_ah_rcv,
>> .err_handler = xfrm6_ah_err,
>> .flags = INET6_PROTO_NOPOLICY,
>> + .ext_hdr_order = IPV6_EXT_HDR_ORDER_AUTH
>> };
>>
>> static const struct inet6_protocol ipcomp6_protocol = {
>> --
>> 2.43.0
>>
>
>
* Re: [PATCH net-next v5 2/7] ipv6: Cleanup IPv6 TLV definitions
2026-01-29 5:30 ` Willem de Bruijn
@ 2026-01-29 18:13 ` Justin Iurman
2026-01-29 19:01 ` Willem de Bruijn
2026-01-30 17:22 ` Tom Herbert
0 siblings, 2 replies; 30+ messages in thread
From: Justin Iurman @ 2026-01-29 18:13 UTC (permalink / raw)
To: Willem de Bruijn, Tom Herbert, davem, kuba, netdev
On 1/29/26 06:30, Willem de Bruijn wrote:
> Tom Herbert wrote:
>> Move IPV6_TLV_TNL_ENCAP_LIMIT to uapi/linux/in6.h to be with the rest
>> of the TLV definitions. Label each of the TLV definitions as to whether
>> they are a Hop-by-Hop option, Destination option, or both.
>>
>> Signed-off-by: Tom Herbert <tom@herbertland.com>
>> ---
>> include/uapi/linux/in6.h | 21 ++++++++++++++-------
>> include/uapi/linux/ip6_tunnel.h | 1 -
>> 2 files changed, 14 insertions(+), 8 deletions(-)
>>
>> diff --git a/include/uapi/linux/in6.h b/include/uapi/linux/in6.h
>> index 5a47339ef7d7..438283dc5fde 100644
>> --- a/include/uapi/linux/in6.h
>> +++ b/include/uapi/linux/in6.h
>> @@ -140,14 +140,21 @@ struct in6_flowlabel_req {
>>
>> /*
>> * IPv6 TLV options.
>> + *
>> + * Hop-by-Hop and Destination options share the same number space.
>> + * For each option below whether it is a Hop-by-Hop option or
>> + * a Destination option is indicated by HBH or DestOpt.
>> */
>> -#define IPV6_TLV_PAD1 0
>> -#define IPV6_TLV_PADN 1
>> -#define IPV6_TLV_ROUTERALERT 5
>> -#define IPV6_TLV_CALIPSO 7 /* RFC 5570 */
>> -#define IPV6_TLV_IOAM 49 /* RFC 9486 */
>> -#define IPV6_TLV_JUMBO 194
>> -#define IPV6_TLV_HAO 201 /* home address option */
>> +#define IPV6_TLV_PAD1 0 /* HBH or DestOpt */
>> +#define IPV6_TLV_PADN 1 /* HBH or DestOpt */
>> +#define IPV6_TLV_TNL_ENCAP_LIMIT 4 /* RFC 2473, DestOpt */
>> +#define IPV6_TLV_ROUTERALERT 5 /* HBH */
>> +#define IPV6_TLV_CALIPSO 7 /* RFC 5570, HBH */
>> +#define IPV6_TLV_IOAM 49 /* RFC 9486, HBH or Destopt
>> + * IOAM sent and rcvd as HBH
>
> Explicit labeling with HBH or Destopt is quite informative.
>
> Does this mean that IPV6_TLV_IOAM should also be accepted in ip6_parse_tlv
> in the Destopt branch? RFC 9486 indeed did reserve a number.
Nope, not right now. The only IOAM option currently implemented in the
kernel is the Pre-allocated Trace, which uses a Hop-by-Hop option. It
wouldn't make sense to have it in a Destination option, although you
could (i.e., it's not forbidden, just weird). Actually, the only IOAM
option that would make sense to carry in a Destination Option is the
Edge-to-Edge (E2E), but it's not implemented in the kernel. Should it be
implemented at some point, then yes, you'd have IPV6_TLV_IOAM in the
Destopt branch as well.
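To make the HBH/DestOpt split concrete, here is a minimal user-space
sketch in C of a lookup that follows the labels in the patch comment and
the point above that IOAM is currently handled only as a Hop-by-Hop
option. The enum opt_hdr_kind and the function tlv_accepted_in() are
hypothetical names used for illustration, not kernel identifiers; in the
kernel the decision lives in the switch on TLV type in ip6_parse_tlv.

```c
#include <stdbool.h>
#include <stdint.h>

/* Option type values as listed in the patch to include/uapi/linux/in6.h. */
#define IPV6_TLV_PAD1			0
#define IPV6_TLV_PADN			1
#define IPV6_TLV_TNL_ENCAP_LIMIT	4
#define IPV6_TLV_ROUTERALERT		5
#define IPV6_TLV_CALIPSO		7
#define IPV6_TLV_IOAM			49
#define IPV6_TLV_JUMBO			194
#define IPV6_TLV_HAO			201

/* Hypothetical: which kind of options header is being parsed. */
enum opt_hdr_kind { OPT_HDR_HBH, OPT_HDR_DESTOPT };

/* Hypothetical helper: is this option type recognized in this header? */
static bool tlv_accepted_in(uint8_t type, enum opt_hdr_kind kind)
{
	switch (type) {
	case IPV6_TLV_PAD1:
	case IPV6_TLV_PADN:
		/* Padding is valid in both header kinds. */
		return true;
	case IPV6_TLV_ROUTERALERT:
	case IPV6_TLV_CALIPSO:
	case IPV6_TLV_JUMBO:
	case IPV6_TLV_IOAM:
		/* IOAM is sent and received as HBH only for now; an E2E
		 * implementation would add a DestOpt case.
		 */
		return kind == OPT_HDR_HBH;
	case IPV6_TLV_TNL_ENCAP_LIMIT:
	case IPV6_TLV_HAO:
		return kind == OPT_HDR_DESTOPT;
	default:
		/* Unknown option: not recognized in either header. */
		return false;
	}
}
```

With this table, tlv_accepted_in(IPV6_TLV_IOAM, OPT_HDR_DESTOPT) is
false today and would flip to true only if E2E support were added.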
>> + */
>> +#define IPV6_TLV_JUMBO 194 /* HBH */
>> +#define IPV6_TLV_HAO 201 /* home address option, DestOpt */
>>
>> /*
>> * IPV6 socket options
>> diff --git a/include/uapi/linux/ip6_tunnel.h b/include/uapi/linux/ip6_tunnel.h
>> index 85182a839d42..35af4d9c35fb 100644
>> --- a/include/uapi/linux/ip6_tunnel.h
>> +++ b/include/uapi/linux/ip6_tunnel.h
>> @@ -6,7 +6,6 @@
>> #include <linux/if.h> /* For IFNAMSIZ. */
>> #include <linux/in6.h> /* For struct in6_addr. */
>>
>> -#define IPV6_TLV_TNL_ENCAP_LIMIT 4
>> #define IPV6_DEFAULT_TNL_ENCAP_LIMIT 4
>>
>> /* don't add encapsulation limit if one isn't present in inner packet */
>> --
>> 2.43.0
>>
>
>
* Re: [PATCH net-next v5 2/7] ipv6: Cleanup IPv6 TLV definitions
2026-01-29 18:13 ` Justin Iurman
@ 2026-01-29 19:01 ` Willem de Bruijn
2026-01-30 17:22 ` Tom Herbert
1 sibling, 0 replies; 30+ messages in thread
From: Willem de Bruijn @ 2026-01-29 19:01 UTC (permalink / raw)
To: Justin Iurman, Willem de Bruijn, Tom Herbert, davem, kuba, netdev
Justin Iurman wrote:
> On 1/29/26 06:30, Willem de Bruijn wrote:
> > Tom Herbert wrote:
> >> Move IPV6_TLV_TNL_ENCAP_LIMIT to uapi/linux/in6.h to be with the rest
> >> of the TLV definitions. Label each of the TLV definitions as to whether
> >> they are a Hop-by-Hop option, Destination option, or both.
> >>
> >> Signed-off-by: Tom Herbert <tom@herbertland.com>
> >> ---
> >> include/uapi/linux/in6.h | 21 ++++++++++++++-------
> >> include/uapi/linux/ip6_tunnel.h | 1 -
> >> 2 files changed, 14 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/include/uapi/linux/in6.h b/include/uapi/linux/in6.h
> >> index 5a47339ef7d7..438283dc5fde 100644
> >> --- a/include/uapi/linux/in6.h
> >> +++ b/include/uapi/linux/in6.h
> >> @@ -140,14 +140,21 @@ struct in6_flowlabel_req {
> >>
> >> /*
> >> * IPv6 TLV options.
> >> + *
> >> + * Hop-by-Hop and Destination options share the same number space.
> >> + * For each option below whether it is a Hop-by-Hop option or
> >> + * a Destination option is indicated by HBH or DestOpt.
> >> */
> >> -#define IPV6_TLV_PAD1 0
> >> -#define IPV6_TLV_PADN 1
> >> -#define IPV6_TLV_ROUTERALERT 5
> >> -#define IPV6_TLV_CALIPSO 7 /* RFC 5570 */
> >> -#define IPV6_TLV_IOAM 49 /* RFC 9486 */
> >> -#define IPV6_TLV_JUMBO 194
> >> -#define IPV6_TLV_HAO 201 /* home address option */
> >> +#define IPV6_TLV_PAD1 0 /* HBH or DestOpt */
> >> +#define IPV6_TLV_PADN 1 /* HBH or DestOpt */
> >> +#define IPV6_TLV_TNL_ENCAP_LIMIT 4 /* RFC 2473, DestOpt */
> >> +#define IPV6_TLV_ROUTERALERT 5 /* HBH */
> >> +#define IPV6_TLV_CALIPSO 7 /* RFC 5570, HBH */
> >> +#define IPV6_TLV_IOAM 49 /* RFC 9486, HBH or Destopt
> >> + * IOAM sent and rcvd as HBH
> >
> > Explicit labeling with HBH or Destopt is quite informative.
> >
> > Does this mean that IPV6_TLV_IOAM should also be accepted in ip6_parse_tlv
> > in the Destopt branch? RFC 9486 indeed did reserve a number.
>
> Nope, not right now. The only IOAM option currently implemented in the
> kernel is the Pre-allocated Trace, which uses a Hop-by-Hop option. It
> wouldn't make sense to have it in a Destination option, although you
> could (i.e., it's not forbidden, just weird). Actually, the only IOAM
> option that would make sense to carry in a Destination Option is the
> Edge-to-Edge (E2E), but it's not implemented in the kernel. Should it be
> implemented at some point, then yes, you'd have IPV6_TLV_IOAM in the
> Destopt branch as well.
Sounds great. Thanks for that context.
* Re: [PATCH net-next v5 6/7] ipv6: Enforce Extension Header ordering
2026-01-29 18:07 ` Justin Iurman
@ 2026-01-29 19:05 ` Willem de Bruijn
2026-01-29 20:13 ` Justin Iurman
2026-01-30 17:06 ` Tom Herbert
0 siblings, 2 replies; 30+ messages in thread
From: Willem de Bruijn @ 2026-01-29 19:05 UTC (permalink / raw)
To: Justin Iurman, Willem de Bruijn, Tom Herbert, davem, kuba, netdev
Justin Iurman wrote:
> On 1/29/26 06:18, Willem de Bruijn wrote:
> > Tom Herbert wrote:
> >> RFC8200 highly recommends that different Extension Headers be send in
> >> a prescibed order and all Extension Header types occur at most once
> >> in a packet with the exception of Destination Options that may
> >> occur twice. This patch enforces the ordering be folowed in received
> >> packets.
> >>
> >> The allowed order of Extension Headers is:
> >>
> >> IPv6 header
> >> Hop-by-Hop Options header
> >> Destination Options before the Routing Header
> >> Routing header
> >> Fragment header
> >> Authentication header
> >> Encapsulating Security Payload header
> >> Destination Options header
> >> Upper-Layer header
> >>
> >> Each Extension Header may be present only once in a packet.
> >>
> >> net.ipv6.enforce_ext_hdr_order is a sysctl to enable or disable
> >> enforcement of xtension Header order. If it is set to zero then
> >
> > [e]xtension. There are a few more typos in the various commit
> > messages.
> >
> >> Extension Header order and number of occurences is not checked
> >> in receive processeing (except for Hop-by-Hop Options that
> >> must be the first Extension Header and can only occur once in
> >> a packet.
> >
> > RFC 8200 also states
> >
> > "IPv6 nodes must accept and attempt to process extension headers in
> > any order and occurring any number of times in the same packet,
> > except for the Hop-by-Hop Options header, which is restricted to
> > appear immediately after an IPv6 header only. Nonetheless, it is
> > strongly advised that sources of IPv6 packets adhere to the above
> > recommended order until and unless subsequent specifications revise
> > that recommendation."
> >
> > A case of be strict in what you send, liberal in what you accept.
> >
> > This new sysctl has a chance of breaking existing users.
>
> Willem,
>
> Note that RFC8200 does not use normative language, which is part of the
> problem. It could theoretically break existing users, but I don't think
> it will in reality. For that, you would need users to beginning with
> (joke aside, I like EHs). Anyway, if the order is enforced at sending,
> why would any receiver accept a different order? In this case, being
> liberal in what we accept might be a security risk (see below).
>
> > The series as a whole is framed as a security improvement. Does
> > enforcing order help with that?
>
> IMHO, any packet with EHs in a different order than the one specified in
> RFC8200 looks suspicious. So, yes.
Looks suspicious. But does not introduce concrete new risks?
The main risk I understand around IPv6 extension headers is the risk
common to all untrusted network input: bugs in parsing code. Bugs can
cause crashes, infinite loops, or worse subtle effects. This is why we
introduced the BPF flow dissector, for instance.
I don't immediately see how different order of headers increases
parsing risk. Nor, btw, that reducing max number of headers from 8 to
2 significantly mitigates a real risk.
No objections necessarily. But I don't fully understand the argument.
* Re: [PATCH net-next v5 6/7] ipv6: Enforce Extension Header ordering
2026-01-29 19:05 ` Willem de Bruijn
@ 2026-01-29 20:13 ` Justin Iurman
2026-01-30 17:06 ` Tom Herbert
1 sibling, 0 replies; 30+ messages in thread
From: Justin Iurman @ 2026-01-29 20:13 UTC (permalink / raw)
To: Willem de Bruijn, Tom Herbert, davem, kuba, netdev
On 1/29/26 20:05, Willem de Bruijn wrote:
> Justin Iurman wrote:
>> On 1/29/26 06:18, Willem de Bruijn wrote:
>>> Tom Herbert wrote:
>>>> RFC8200 highly recommends that different Extension Headers be send in
>>>> a prescibed order and all Extension Header types occur at most once
>>>> in a packet with the exception of Destination Options that may
>>>> occur twice. This patch enforces the ordering be folowed in received
>>>> packets.
>>>>
>>>> The allowed order of Extension Headers is:
>>>>
>>>> IPv6 header
>>>> Hop-by-Hop Options header
>>>> Destination Options before the Routing Header
>>>> Routing header
>>>> Fragment header
>>>> Authentication header
>>>> Encapsulating Security Payload header
>>>> Destination Options header
>>>> Upper-Layer header
>>>>
>>>> Each Extension Header may be present only once in a packet.
>>>>
>>>> net.ipv6.enforce_ext_hdr_order is a sysctl to enable or disable
>>>> enforcement of xtension Header order. If it is set to zero then
>>>
>>> [e]xtension. There are a few more typos in the various commit
>>> messages.
>>>
>>>> Extension Header order and number of occurences is not checked
>>>> in receive processeing (except for Hop-by-Hop Options that
>>>> must be the first Extension Header and can only occur once in
>>>> a packet.
>>>
>>> RFC 8200 also states
>>>
>>> "IPv6 nodes must accept and attempt to process extension headers in
>>> any order and occurring any number of times in the same packet,
>>> except for the Hop-by-Hop Options header, which is restricted to
>>> appear immediately after an IPv6 header only. Nonetheless, it is
>>> strongly advised that sources of IPv6 packets adhere to the above
>>> recommended order until and unless subsequent specifications revise
>>> that recommendation."
>>>
>>> A case of be strict in what you send, liberal in what you accept.
>>>
>>> This new sysctl has a chance of breaking existing users.
>>
>> Willem,
>>
>> Note that RFC8200 does not use normative language, which is part of the
>> problem. It could theoretically break existing users, but I don't think
>> it will in reality. For that, you would need users to beginning with
>> (joke aside, I like EHs). Anyway, if the order is enforced at sending,
>> why would any receiver accept a different order? In this case, being
>> liberal in what we accept might be a security risk (see below).
>>
>>> The series as a whole is framed as a security improvement. Does
>>> enforcing order help with that?
>>
>> IMHO, any packet with EHs in a different order than the one specified in
>> RFC8200 looks suspicious. So, yes.
>
> Looks suspicious. But does not introduce concrete new risks?
>
> The main risk I understand around IPv6 extension headers is the risk
> common to all untrusted network input: bugs in parsing code. Bugs can
> cause crashes, infinite loops, or worse subtle effects. This is why we
> introduced the BPF flow dissector, for instance.
>
> I don't immediately see how different order of headers increases
> parsing risk. Nor, btw, that reducing max number of headers from 8 to
> 2 significantly mitigates a real risk.
>
> No objections necessarily. But I don't fully understand the argument.
Enforcing the order kills two birds with one stone, as it also
automatically limits the number of DestOpts in general. This is a gap
introduced by RFC8200, where you "must" accept EHs in any order and
occurring any number of times in the same packet (even though it says
that there might be two DestOpts max). The wording is ambiguous and
non-normative. Since a DestOpt can contain a certain number of
options, you theoretically end up with a nice attack vector.
Regarding why "reducing max number of headers from 8 to 2 significantly
mitigates a real risk": it's not about the number of EHs; it's about
the number of options inside a Hop-by-Hop or Destination Options
header specifically. So you further reduce the attack surface.
* Re: [PATCH net-next v5 6/7] ipv6: Enforce Extension Header ordering
2026-01-29 19:05 ` Willem de Bruijn
2026-01-29 20:13 ` Justin Iurman
@ 2026-01-30 17:06 ` Tom Herbert
2026-01-31 17:24 ` Willem de Bruijn
1 sibling, 1 reply; 30+ messages in thread
From: Tom Herbert @ 2026-01-30 17:06 UTC (permalink / raw)
To: Willem de Bruijn; +Cc: Justin Iurman, davem, kuba, netdev
On Thu, Jan 29, 2026 at 11:05 AM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> Justin Iurman wrote:
> > On 1/29/26 06:18, Willem de Bruijn wrote:
> > > Tom Herbert wrote:
> > >> RFC8200 highly recommends that different Extension Headers be send in
> > >> a prescibed order and all Extension Header types occur at most once
> > >> in a packet with the exception of Destination Options that may
> > >> occur twice. This patch enforces the ordering be folowed in received
> > >> packets.
> > >>
> > >> The allowed order of Extension Headers is:
> > >>
> > >> IPv6 header
> > >> Hop-by-Hop Options header
> > >> Destination Options before the Routing Header
> > >> Routing header
> > >> Fragment header
> > >> Authentication header
> > >> Encapsulating Security Payload header
> > >> Destination Options header
> > >> Upper-Layer header
> > >>
> > >> Each Extension Header may be present only once in a packet.
> > >>
> > >> net.ipv6.enforce_ext_hdr_order is a sysctl to enable or disable
> > >> enforcement of xtension Header order. If it is set to zero then
> > >
> > > [e]xtension. There are a few more typos in the various commit
> > > messages.
> > >
> > >> Extension Header order and number of occurences is not checked
> > >> in receive processeing (except for Hop-by-Hop Options that
> > >> must be the first Extension Header and can only occur once in
> > >> a packet.
> > >
> > > RFC 8200 also states
> > >
> > > "IPv6 nodes must accept and attempt to process extension headers in
> > > any order and occurring any number of times in the same packet,
> > > except for the Hop-by-Hop Options header, which is restricted to
> > > appear immediately after an IPv6 header only. Nonetheless, it is
> > > strongly advised that sources of IPv6 packets adhere to the above
> > > recommended order until and unless subsequent specifications revise
> > > that recommendation."
> > >
> > > A case of be strict in what you send, liberal in what you accept.
> > >
> > > This new sysctl has a chance of breaking existing users.
> >
> > Willem,
> >
> > Note that RFC8200 does not use normative language, which is part of the
> > problem. It could theoretically break existing users, but I don't think
> > it will in reality. For that, you would need users to beginning with
> > (joke aside, I like EHs). Anyway, if the order is enforced at sending,
> > why would any receiver accept a different order? In this case, being
> > liberal in what we accept might be a security risk (see below).
> >
> > > The series as a whole is framed as a security improvement. Does
> > > enforcing order help with that?
> >
> > IMHO, any packet with EHs in a different order than the one specified in
> > RFC8200 looks suspicious. So, yes.
>
Hi Willem, thanks for your comments!
> Looks suspicious. But does not introduce concrete new risks?
Hard to say specifically. On the other hand, there are no known use
cases for alternative orderings, and given all the other security
perils, a strong default security posture wrt EH order seems prudent.
>
> The main risk I understand around IPv6 extension headers is the risk
> common to all untrusted network input: bugs in parsing code. Bugs can
> cause crashes, infinite loops, or worse subtle effects. This is why we
> introduced the BPF flow dissector, for instance.
I believe the main risk is Denial of Service attacks and security
vulnerabilities.
>
> I don't immediately see how different order of headers increases
> parsing risk. Nor, btw, that reducing max number of headers from 8 to
> 2 significantly mitigates a real risk.
It's a similar rationale to putting a limit on the number of options
in the first place. Going from 700 options allowed in a packet to at
most 8 allowed was a no-brainer. But even 8 is too much considering
that the stack only supports three Hop-by-Hop options and two
Destination options. There's a common misconception that only hardware
has trouble parsing TLVs, and that somehow it's free in SW (I've been
battling that mindset!). For instance, a well constructed attack could
force one, maybe two, cache misses per option. So going from 8 to 2 as
the default limit could materially mitigate the damage of a DoS attack
on options.
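To show where the limit bites, here is a minimal user-space sketch in C
of a TLV walk that counts non-padding options and stops as soon as the
count exceeds the limit. count_limited_tlv_walk() is a hypothetical
name, not the kernel's ip6_parse_tlv; it treats a limit of zero as "drop
any packet carrying this header", as described in the cover letter, and
does not model the kernel's handling of negative limits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IPV6_TLV_PAD1	0
#define IPV6_TLV_PADN	1

/* Hypothetical helper. opts points at the first option (just past the
 * Next Header and Hdr Ext Len bytes) and len is the size of the option
 * area in bytes. Returns 0 to accept the header, -1 to drop the packet.
 */
static int count_limited_tlv_walk(const uint8_t *opts, size_t len,
				  int max_count)
{
	size_t off = 0;
	int count = 0;

	assert(opts != NULL || len == 0);

	/* A limit of zero disallows the header altogether. */
	if (max_count == 0)
		return -1;

	while (off < len) {
		uint8_t type = opts[off];
		size_t optlen;

		/* Pad1 is a single byte with no length field. */
		if (type == IPV6_TLV_PAD1) {
			off++;
			continue;
		}

		/* Every other option has a type byte and a length byte. */
		if (len - off < 2)
			return -1;
		optlen = 2 + (size_t)opts[off + 1];
		if (optlen > len - off)
			return -1;

		/* Only non-padding options count against the limit, so the
		 * walk ends after at most max_count + 1 of them.
		 */
		if (type != IPV6_TLV_PADN && ++count > max_count)
			return -1;

		off += optlen;
	}

	return 0;
}
```

With a limit of 2 the walk gives up at the third non-padding option, so
an attacker gets at most three option touches per header instead of
nine.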
>
> No objections necessarily. But I don't fully understand the argument.
We selected the default value of eight in RFC8504 based on an
expectation that there might be new options defined and that the
Internet would be fixed to reliably support extension headers
including those with options. I do not believe either of those is
going to happen.
Hop-by-Hop Options are ostensibly the right way to do network to host
and host to network signaling. The only HBH options that might get any
substantial deployment are the Router Alert option and IOAM. The Router
Alert option is being deprecated and IOAM is at best a "nice to
have". The best use case of Hop-by-Hop options is congestion
signaling; unfortunately, the die was cast when the CSIG authors decided
to place the information in VLANs at L2 and cajole the information to be
routable through a switch. IMO, the miss on CSIG is pretty much the
nail in the coffin for Hop-by-Hop options ever being widely deployed
(https://www.ietf.org/archive/id/draft-ravi-ippm-csig-01.txt).
Destination Options have proven even less useful than Hop-by-Hop
Options. The only Destination Options supported by the stack are the
Tunnel Encap Limit option and the Home Address Option. The Tunnel Encap
Limit option was buried in the v6 tunnel code, which is why it wasn't
obvious it was supported in the first version of the patch set. I'll
assume this might be useful, so this patch set cleans up the code for
it. I don't believe there's any use of the Home Address Option.
A major problem with DestOpts, HBH, Routing Header, and Fragment
header is that they have no inherent security. Their use presents a
security risk, especially when sent over untrusted networks including
the Internet. Given that, and given the high drop rates of extension
headers on the Internet, I am proposing that Extension Headers except
for ESP be deprecated on the Internet
(https://www.ietf.org/archive/id/draft-herbert-deprecate-eh-01.txt).
IMO, IPv6 Extension Headers are a failed experiment in protocol
design. The vast majority of hosts do not care about them, and the
best use case for them is Denial of Service attack. So IMO the greater
good is to limit them as much as possible. If some private network
finds a use case for them they can also configure sysctls if they
don't like the default behavior.
Tom
* Re: [PATCH net-next v5 2/7] ipv6: Cleanup IPv6 TLV definitions
2026-01-29 18:13 ` Justin Iurman
2026-01-29 19:01 ` Willem de Bruijn
@ 2026-01-30 17:22 ` Tom Herbert
2026-02-01 8:48 ` Justin Iurman
1 sibling, 1 reply; 30+ messages in thread
From: Tom Herbert @ 2026-01-30 17:22 UTC (permalink / raw)
To: Justin Iurman; +Cc: Willem de Bruijn, davem, kuba, netdev
On Thu, Jan 29, 2026 at 10:13 AM Justin Iurman <justin.iurman@gmail.com> wrote:
>
> On 1/29/26 06:30, Willem de Bruijn wrote:
> > Tom Herbert wrote:
> >> Move IPV6_TLV_TNL_ENCAP_LIMIT to uapi/linux/in6.h to be with the rest
> >> of the TLV definitions. Label each of the TLV definitions as to whether
> >> they are a Hop-by-Hop option, Destination option, or both.
> >>
> >> Signed-off-by: Tom Herbert <tom@herbertland.com>
> >> ---
> >> include/uapi/linux/in6.h | 21 ++++++++++++++-------
> >> include/uapi/linux/ip6_tunnel.h | 1 -
> >> 2 files changed, 14 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/include/uapi/linux/in6.h b/include/uapi/linux/in6.h
> >> index 5a47339ef7d7..438283dc5fde 100644
> >> --- a/include/uapi/linux/in6.h
> >> +++ b/include/uapi/linux/in6.h
> >> @@ -140,14 +140,21 @@ struct in6_flowlabel_req {
> >>
> >> /*
> >> * IPv6 TLV options.
> >> + *
> >> + * Hop-by-Hop and Destination options share the same number space.
> >> + * For each option below whether it is a Hop-by-Hop option or
> >> + * a Destination option is indicated by HBH or DestOpt.
> >> */
> >> -#define IPV6_TLV_PAD1 0
> >> -#define IPV6_TLV_PADN 1
> >> -#define IPV6_TLV_ROUTERALERT 5
> >> -#define IPV6_TLV_CALIPSO 7 /* RFC 5570 */
> >> -#define IPV6_TLV_IOAM 49 /* RFC 9486 */
> >> -#define IPV6_TLV_JUMBO 194
> >> -#define IPV6_TLV_HAO 201 /* home address option */
> >> +#define IPV6_TLV_PAD1 0 /* HBH or DestOpt */
> >> +#define IPV6_TLV_PADN 1 /* HBH or DestOpt */
> >> +#define IPV6_TLV_TNL_ENCAP_LIMIT 4 /* RFC 2473, DestOpt */
> >> +#define IPV6_TLV_ROUTERALERT 5 /* HBH */
> >> +#define IPV6_TLV_CALIPSO 7 /* RFC 5570, HBH */
> >> +#define IPV6_TLV_IOAM 49 /* RFC 9486, HBH or Destopt
> >> + * IOAM sent and rcvd as HBH
> >
> > Explicit labeling with HBH or Destopt is quite informative.
> >
> > Does this mean that IPV6_TLV_IOAM should also be accepted in ip6_parse_tlv
> > in the Destopt branch? RFC 9486 indeed did reserve a number.
>
> Nope, not right now. The only IOAM option currently implemented in the
> kernel is the Pre-allocated Trace, which uses a Hop-by-Hop option. It
> wouldn't make sense to have it in a Destination option, although you
> could (i.e., it's not forbidden, just weird). Actually, the only IOAM
> option that would make sense to carry in a Destination Option is the
> Edge-to-Edge (E2E), but it's not implemented in the kernel. Should it be
> implemented at some point, then yes, you'd have IPV6_TLV_IOAM in the
> Destopt branch as well.
Justin,
Conceptually, someone could put IOAM in Destination Options before the
Routing Header. There's about a 0% chance of that ever happening though.
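As an illustration of the HBH/DestOpt labeling discussed above, the comments in the patch could be expressed as a small lookup table. This is only a sketch: the table, the flag names, and tlv_allowed() are hypothetical and are not part of the patch; the option values come from the in6.h hunk. IOAM is marked HBH-only here to reflect what the stack currently sends and receives, even though RFC 9486 also allows it as a Destination option.

```c
#include <stddef.h>
#include <stdint.h>

#define TLV_IN_HBH	0x1	/* may appear in Hop-by-Hop Options */
#define TLV_IN_DEST	0x2	/* may appear in Destination Options */

struct tlv_class {
	uint8_t type;	/* option type, values as in uapi/linux/in6.h */
	uint8_t where;	/* TLV_IN_* flags */
};

static const struct tlv_class tlv_classes[] = {
	{   0, TLV_IN_HBH | TLV_IN_DEST },	/* IPV6_TLV_PAD1 */
	{   1, TLV_IN_HBH | TLV_IN_DEST },	/* IPV6_TLV_PADN */
	{   4, TLV_IN_DEST },			/* IPV6_TLV_TNL_ENCAP_LIMIT */
	{   5, TLV_IN_HBH },			/* IPV6_TLV_ROUTERALERT */
	{   7, TLV_IN_HBH },			/* IPV6_TLV_CALIPSO */
	{  49, TLV_IN_HBH },			/* IPV6_TLV_IOAM, HBH only for now */
	{ 194, TLV_IN_HBH },			/* IPV6_TLV_JUMBO */
	{ 201, TLV_IN_DEST },			/* IPV6_TLV_HAO */
};

/* Return 1 if the option type is recognized in the given header kind. */
static int tlv_allowed(uint8_t type, int in_hbh)
{
	uint8_t want = in_hbh ? TLV_IN_HBH : TLV_IN_DEST;
	size_t i;

	for (i = 0; i < sizeof(tlv_classes) / sizeof(tlv_classes[0]); i++)
		if (tlv_classes[i].type == type)
			return (tlv_classes[i].where & want) != 0;

	return 0;	/* unknown option */
}
```

If IOAM Edge-to-Edge were ever implemented, accepting it in the Destopt branch would amount to adding TLV_IN_DEST to the IOAM entry in this model.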
Tom
>
> >> + */
> >> +#define IPV6_TLV_JUMBO 194 /* HBH */
> >> +#define IPV6_TLV_HAO 201 /* home address option, DestOpt */
> >>
> >> /*
> >> * IPV6 socket options
> >> diff --git a/include/uapi/linux/ip6_tunnel.h b/include/uapi/linux/ip6_tunnel.h
> >> index 85182a839d42..35af4d9c35fb 100644
> >> --- a/include/uapi/linux/ip6_tunnel.h
> >> +++ b/include/uapi/linux/ip6_tunnel.h
> >> @@ -6,7 +6,6 @@
> >> #include <linux/if.h> /* For IFNAMSIZ. */
> >> #include <linux/in6.h> /* For struct in6_addr. */
> >>
> >> -#define IPV6_TLV_TNL_ENCAP_LIMIT 4
> >> #define IPV6_DEFAULT_TNL_ENCAP_LIMIT 4
> >>
> >> /* don't add encapsulation limit if one isn't present in inner packet */
> >> --
> >> 2.43.0
> >>
> >
> >
>
* Re: [PATCH net-next v5 6/7] ipv6: Enforce Extension Header ordering
2026-01-30 17:06 ` Tom Herbert
@ 2026-01-31 17:24 ` Willem de Bruijn
2026-02-02 22:21 ` Tom Herbert
0 siblings, 1 reply; 30+ messages in thread
From: Willem de Bruijn @ 2026-01-31 17:24 UTC (permalink / raw)
To: Tom Herbert, Willem de Bruijn; +Cc: Justin Iurman, davem, kuba, netdev
Tom Herbert wrote:
> On Thu, Jan 29, 2026 at 11:05 AM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
> >
> > Justin Iurman wrote:
> > > On 1/29/26 06:18, Willem de Bruijn wrote:
> > > > Tom Herbert wrote:
> > > >> RFC8200 highly recommends that different Extension Headers be send in
> > > >> a prescibed order and all Extension Header types occur at most once
> > > >> in a packet with the exception of Destination Options that may
> > > >> occur twice. This patch enforces the ordering be folowed in received
> > > >> packets.
> > > >>
> > > >> The allowed order of Extension Headers is:
> > > >>
> > > >> IPv6 header
> > > >> Hop-by-Hop Options header
> > > >> Destination Options before the Routing Header
> > > >> Routing header
> > > >> Fragment header
> > > >> Authentication header
> > > >> Encapsulating Security Payload header
> > > >> Destination Options header
> > > >> Upper-Layer header
> > > >>
> > > >> Each Extension Header may be present only once in a packet.
> > > >>
> > > >> net.ipv6.enforce_ext_hdr_order is a sysctl to enable or disable
> > > >> enforcement of xtension Header order. If it is set to zero then
> > > >
> > > > [e]xtension. There are a few more typos in the various commit
> > > > messages.
> > > >
> > > >> Extension Header order and number of occurences is not checked
> > > >> in receive processeing (except for Hop-by-Hop Options that
> > > >> must be the first Extension Header and can only occur once in
> > > >> a packet.
> > > >
> > > > RFC 8200 also states
> > > >
> > > > "IPv6 nodes must accept and attempt to process extension headers in
> > > > any order and occurring any number of times in the same packet,
> > > > except for the Hop-by-Hop Options header, which is restricted to
> > > > appear immediately after an IPv6 header only. Nonetheless, it is
> > > > strongly advised that sources of IPv6 packets adhere to the above
> > > > recommended order until and unless subsequent specifications revise
> > > > that recommendation."
> > > >
> > > > A case of be strict in what you send, liberal in what you accept.
> > > >
> > > > This new sysctl has a chance of breaking existing users.
> > >
> > > Willem,
> > >
> > > Note that RFC8200 does not use normative language, which is part of the
> > > problem. It could theoretically break existing users, but I don't think
> > > it will in reality. For that, you would need users to beginning with
> > > (joke aside, I like EHs). Anyway, if the order is enforced at sending,
> > > why would any receiver accept a different order? In this case, being
> > > liberal in what we accept might be a security risk (see below).
> > >
> > > > The series as a whole is framed as a security improvement. Does
> > > > enforcing order help with that?
> > >
> > > IMHO, any packet with EHs in a different order than the one specified in
> > > RFC8200 looks suspicious. So, yes.
> >
>
> Hi Willem, thanks for your comments!
>
> > Looks suspicious. But does not introduce concrete new risks?
>
> Hard to say specifically. On the other hand, there's no known use
> cases for alternatives for it and given all the other security perils
> a strong default security posture wrt EH order seems prudent.
>
> >
> > The main risk I understand around IPv6 extension headers is the risk
> > common to all untrusted network input: bugs in parsing code. Bugs can
> > cause crashes, infinite loops, or worse subtle effects. This is why we
> > introduced the BPF flow dissector, for instance.
>
> I believe the main risk is Denial of Service attacks and security
> vulnerabilities.
>
> >
> > I don't immediately see how different order of headers increases
> > parsing risk. Nor, btw, that reducing max number of headers from 8 to
> > 2 significantly mitigates a real risk.
>
> It's a similar rationale to putting a limit on the number of options
> in the first place. Going from 700 options allowed in a packet to at
> most 8 allowed was a no brainer. But even 8 is too much considering
> that the stack only supports three Hop-by-Hop options and two
> Destination options. There's a common misnomer that only hardware has
> trouble parsing TLVs, and somehow it's free in SW (I've been battling
> that mindset!). For instance, a well constructed attack could force
> one cache, maybe two cache misses per each option. So going from 8 to
> 2 as the default limit could materially mitigate the damage for a DoS
> attack on options.
>
> >
> > No objections necessarily. But I don't fully understand the argument.
>
> We selected the default value of eight in RFC8504 based on an
> expectation that there might be new options defined and that the
> Internet would be fixed to reliably support extension headers
> including those with options. I do not believe either of those are
> going to happen.
>
> Hop-by-Hop Options are ostensibly the right way to do network to host
> and host to network signaling.The only HBH options that might get any
> substantial deployment are Router Alert option and IOAM. The Router
> Alert option is being deprecated and IOAM is at best a "nice toi
> have". The best use case of Hop-by-Hop options is congestion
> signaling, unfortunately the die was cast when CSIG authors decided to
> place the information in VLANs at L2 and cajole the information to be
> routable through a switch. IMO, the miss on CSIG pretty much is the
> nail in the coffin for Hop-by-Hop options to ever be widely deployed
> (https://www.ietf.org/archive/id/draft-ravi-ippm-csig-01.txt).
>
> Destination Options have proven even less useful than Hop-by-Hop
> Options. The only Destination Option supported by the stack is the
> Tunnel Encap Limit option and Home Address Options. The Tunnel Encap
> Option was buried in the v6 tunnel code which is why it wasn't obvious
> it was supported in the first version of the patch set. I'll assume
> this might be useful, so this patch set cleans up the code for it. I
> don't believe there's any use of Home Address Option.
>
> A major problem with DestOpts, HBH, Routing Header, and Fragment
> header is that they have no inherent security. Their use presents a
> security risk especially when sent over untrusted networks including
> the Internet. Given that and that the high drop rates of extension
> headers on the Internet, I am proposing that Extension header except
> fo ESP being deprecated on the Internet
> (https://www.ietf.org/archive/id/draft-herbert-deprecate-eh-01.txt).
>
> IMO, IPv6 Extension Headers are a failed experiment in protocol
> design. The vast majority of hosts do not care about them, and the
> best use case for them is Denial of Service attack. So IMO the greater
> good is to limit them as much as possible. If some private network
> finds a use case for them they can also configure sysctls if they
> don't like the default behavior.
Thanks for the detailed context, Tom. It may be good to fold this into
the cover letter or commit message if respinning.
One remaining question: these features may be largely unused by
Linux systems, and Linux peers can be expected to follow RFC
directives on ordering.
But there may be other peers in the wild that are not necessarily
malicious, but just less strict. IOW, could this break legitimate
users in the long tail of use cases of Linux in the wild?
Probably good to explicitly state if we are not aware of any.
Or abuses of extension headers for private reasons inside non-public
networks.
To me, lowering the limit from 8 to 2 offers at best a small real
increase in security, but might it carry real regression risk in odd
installations?
* Re: [PATCH net-next v5 2/7] ipv6: Cleanup IPv6 TLV definitions
2026-01-30 17:22 ` Tom Herbert
@ 2026-02-01 8:48 ` Justin Iurman
2026-02-02 22:37 ` Tom Herbert
0 siblings, 1 reply; 30+ messages in thread
From: Justin Iurman @ 2026-02-01 8:48 UTC (permalink / raw)
To: Tom Herbert; +Cc: Willem de Bruijn, davem, kuba, netdev
On 1/30/26 18:22, Tom Herbert wrote:
> On Thu, Jan 29, 2026 at 10:13 AM Justin Iurman <justin.iurman@gmail.com> wrote:
>>
>> On 1/29/26 06:30, Willem de Bruijn wrote:
>>> Tom Herbert wrote:
>>>> Move IPV6_TLV_TNL_ENCAP_LIMIT to uapi/linux/in6.h to be with the rest
>>>> of the TLV definitions. Label each of the TLV definitions as to whether
>>>> they are a Hop-by-Hop option, Destination option, or both.
>>>>
>>>> Signed-off-by: Tom Herbert <tom@herbertland.com>
>>>> ---
>>>> include/uapi/linux/in6.h | 21 ++++++++++++++-------
>>>> include/uapi/linux/ip6_tunnel.h | 1 -
>>>> 2 files changed, 14 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/include/uapi/linux/in6.h b/include/uapi/linux/in6.h
>>>> index 5a47339ef7d7..438283dc5fde 100644
>>>> --- a/include/uapi/linux/in6.h
>>>> +++ b/include/uapi/linux/in6.h
>>>> @@ -140,14 +140,21 @@ struct in6_flowlabel_req {
>>>>
>>>> /*
>>>> * IPv6 TLV options.
>>>> + *
>>>> + * Hop-by-Hop and Destination options share the same number space.
>>>> + * For each option below whether it is a Hop-by-Hop option or
>>>> + * a Destination option is indicated by HBH or DestOpt.
>>>> */
>>>> -#define IPV6_TLV_PAD1 0
>>>> -#define IPV6_TLV_PADN 1
>>>> -#define IPV6_TLV_ROUTERALERT 5
>>>> -#define IPV6_TLV_CALIPSO 7 /* RFC 5570 */
>>>> -#define IPV6_TLV_IOAM 49 /* RFC 9486 */
>>>> -#define IPV6_TLV_JUMBO 194
>>>> -#define IPV6_TLV_HAO 201 /* home address option */
>>>> +#define IPV6_TLV_PAD1 0 /* HBH or DestOpt */
>>>> +#define IPV6_TLV_PADN 1 /* HBH or DestOpt */
>>>> +#define IPV6_TLV_TNL_ENCAP_LIMIT 4 /* RFC 2473, DestOpt */
>>>> +#define IPV6_TLV_ROUTERALERT 5 /* HBH */
>>>> +#define IPV6_TLV_CALIPSO 7 /* RFC 5570, HBH */
>>>> +#define IPV6_TLV_IOAM 49 /* RFC 9486, HBH or Destopt
>>>> + * IOAM sent and rcvd as HBH
>>>
>>> Explicit labeling with HBH or Destopt is quite informative.
>>>
>>> Does this mean that IPV6_TLV_IOAM should also be accepted in ip6_parse_tlv
>>> in the Destopt branch? RFC 9486 indeed did reserve a number.
>>
>> Nope, not right now. The only IOAM option currently implemented in the
>> kernel is the Pre-allocated Trace, which uses a Hop-by-Hop option. It
>> wouldn't make sense to have it in a Destination option, although you
>> could (i.e., it's not forbidden, just weird). Actually, the only IOAM
>> option that would make sense to carry in a Destination Option is the
>> Edge-to-Edge (E2E), but it's not implemented in the kernel. Should it be
>> implemented at some point, then yes, you'd have IPV6_TLV_IOAM in the
>> Destopt branch as well.
>
> Justin,
>
> Conceptually, someone could put IOAM in Destination Options before the
> Routing Header. There's about 0% of that ever happening though.
Tom,
Correct. However, I wouldn't say there's about a 0% chance of that ever
happening. At some point, I remember that we even thought about using
the IOAM Pre-allocated Trace in the Destination Options header (the
first one, before the RH). The goal was to use it with SRv6 and collect
telemetry for the overlay. There was also an attempt at including IOAM
within SRv6 directly (draft-ali-spring-ioam-srv6), which didn't get
consensus at that time.
Justin
> Tom
>
>>
>>>> + */
>>>> +#define IPV6_TLV_JUMBO 194 /* HBH */
>>>> +#define IPV6_TLV_HAO 201 /* home address option, DestOpt */
>>>>
>>>> /*
>>>> * IPV6 socket options
>>>> diff --git a/include/uapi/linux/ip6_tunnel.h b/include/uapi/linux/ip6_tunnel.h
>>>> index 85182a839d42..35af4d9c35fb 100644
>>>> --- a/include/uapi/linux/ip6_tunnel.h
>>>> +++ b/include/uapi/linux/ip6_tunnel.h
>>>> @@ -6,7 +6,6 @@
>>>> #include <linux/if.h> /* For IFNAMSIZ. */
>>>> #include <linux/in6.h> /* For struct in6_addr. */
>>>>
>>>> -#define IPV6_TLV_TNL_ENCAP_LIMIT 4
>>>> #define IPV6_DEFAULT_TNL_ENCAP_LIMIT 4
>>>>
>>>> /* don't add encapsulation limit if one isn't present in inner packet */
>>>> --
>>>> 2.43.0
>>>>
>>>
>>>
>>
* Re: [PATCH net-next v5 6/7] ipv6: Enforce Extension Header ordering
2026-01-31 17:24 ` Willem de Bruijn
@ 2026-02-02 22:21 ` Tom Herbert
0 siblings, 0 replies; 30+ messages in thread
From: Tom Herbert @ 2026-02-02 22:21 UTC (permalink / raw)
To: Willem de Bruijn; +Cc: Justin Iurman, davem, kuba, netdev
On Sat, Jan 31, 2026 at 9:24 AM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> Tom Herbert wrote:
> > On Thu, Jan 29, 2026 at 11:05 AM Willem de Bruijn
> > <willemdebruijn.kernel@gmail.com> wrote:
> > >
> > > Justin Iurman wrote:
> > > > On 1/29/26 06:18, Willem de Bruijn wrote:
> > > > > Tom Herbert wrote:
> > > > >> RFC8200 highly recommends that different Extension Headers be send in
> > > > >> a prescibed order and all Extension Header types occur at most once
> > > > >> in a packet with the exception of Destination Options that may
> > > > >> occur twice. This patch enforces the ordering be folowed in received
> > > > >> packets.
> > > > >>
> > > > >> The allowed order of Extension Headers is:
> > > > >>
> > > > >> IPv6 header
> > > > >> Hop-by-Hop Options header
> > > > >> Destination Options before the Routing Header
> > > > >> Routing header
> > > > >> Fragment header
> > > > >> Authentication header
> > > > >> Encapsulating Security Payload header
> > > > >> Destination Options header
> > > > >> Upper-Layer header
> > > > >>
> > > > >> Each Extension Header may be present only once in a packet.
> > > > >>
> > > > >> net.ipv6.enforce_ext_hdr_order is a sysctl to enable or disable
> > > > >> enforcement of xtension Header order. If it is set to zero then
> > > > >
> > > > > [e]xtension. There are a few more typos in the various commit
> > > > > messages.
> > > > >
> > > > >> Extension Header order and number of occurences is not checked
> > > > >> in receive processeing (except for Hop-by-Hop Options that
> > > > >> must be the first Extension Header and can only occur once in
> > > > >> a packet.
> > > > >
> > > > > RFC 8200 also states
> > > > >
> > > > > "IPv6 nodes must accept and attempt to process extension headers in
> > > > > any order and occurring any number of times in the same packet,
> > > > > except for the Hop-by-Hop Options header, which is restricted to
> > > > > appear immediately after an IPv6 header only. Nonetheless, it is
> > > > > strongly advised that sources of IPv6 packets adhere to the above
> > > > > recommended order until and unless subsequent specifications revise
> > > > > that recommendation."
> > > > >
> > > > > A case of be strict in what you send, liberal in what you accept.
> > > > >
> > > > > This new sysctl has a chance of breaking existing users.
> > > >
> > > > Willem,
> > > >
> > > > Note that RFC8200 does not use normative language, which is part of the
> > > > problem. It could theoretically break existing users, but I don't think
> > > > it will in reality. For that, you would need users to beginning with
> > > > (joke aside, I like EHs). Anyway, if the order is enforced at sending,
> > > > why would any receiver accept a different order? In this case, being
> > > > liberal in what we accept might be a security risk (see below).
> > > >
> > > > > The series as a whole is framed as a security improvement. Does
> > > > > enforcing order help with that?
> > > >
> > > > IMHO, any packet with EHs in a different order than the one specified in
> > > > RFC8200 looks suspicious. So, yes.
> > >
> >
> > Hi Willem, thanks for your comments!
> >
> > > Looks suspicious. But does not introduce concrete new risks?
> >
> > Hard to say specifically. On the other hand, there's no known use
> > cases for alternatives for it and given all the other security perils
> > a strong default security posture wrt EH order seems prudent.
> >
> > >
> > > The main risk I understand around IPv6 extension headers is the risk
> > > common to all untrusted network input: bugs in parsing code. Bugs can
> > > cause crashes, infinite loops, or worse subtle effects. This is why we
> > > introduced the BPF flow dissector, for instance.
> >
> > I believe the main risk is Denial of Service attacks and security
> > vulnerabilities.
> >
> > >
> > > I don't immediately see how different order of headers increases
> > > parsing risk. Nor, btw, that reducing max number of headers from 8 to
> > > 2 significantly mitigates a real risk.
> >
> > It's a similar rationale to putting a limit on the number of options
> > in the first place. Going from 700 options allowed in a packet to at
> > most 8 allowed was a no brainer. But even 8 is too much considering
> > that the stack only supports three Hop-by-Hop options and two
> > Destination options. There's a common misnomer that only hardware has
> > trouble parsing TLVs, and somehow it's free in SW (I've been battling
> > that mindset!). For instance, a well constructed attack could force
> > one cache, maybe two cache misses per each option. So going from 8 to
> > 2 as the default limit could materially mitigate the damage for a DoS
> > attack on options.
> >
> > >
> > > No objections necessarily. But I don't fully understand the argument.
> >
> > We selected the default value of eight in RFC8504 based on an
> > expectation that there might be new options defined and that the
> > Internet would be fixed to reliably support extension headers
> > including those with options. I do not believe either of those are
> > going to happen.
> >
> > Hop-by-Hop Options are ostensibly the right way to do network to host
> > and host to network signaling.The only HBH options that might get any
> > substantial deployment are Router Alert option and IOAM. The Router
> > Alert option is being deprecated and IOAM is at best a "nice toi
> > have". The best use case of Hop-by-Hop options is congestion
> > signaling, unfortunately the die was cast when CSIG authors decided to
> > place the information in VLANs at L2 and cajole the information to be
> > routable through a switch. IMO, the miss on CSIG pretty much is the
> > nail in the coffin for Hop-by-Hop options to ever be widely deployed
> > (https://www.ietf.org/archive/id/draft-ravi-ippm-csig-01.txt).
> >
> > Destination Options have proven even less useful than Hop-by-Hop
> > Options. The only Destination Option supported by the stack is the
> > Tunnel Encap Limit option and Home Address Options. The Tunnel Encap
> > Option was buried in the v6 tunnel code which is why it wasn't obvious
> > it was supported in the first version of the patch set. I'll assume
> > this might be useful, so this patch set cleans up the code for it. I
> > don't believe there's any use of Home Address Option.
> >
> > A major problem with DestOpts, HBH, Routing Header, and Fragment
> > header is that they have no inherent security. Their use presents a
> > security risk especially when sent over untrusted networks including
> > the Internet. Given that and that the high drop rates of extension
> > headers on the Internet, I am proposing that Extension header except
> > fo ESP being deprecated on the Internet
> > (https://www.ietf.org/archive/id/draft-herbert-deprecate-eh-01.txt).
> >
> > IMO, IPv6 Extension Headers are a failed experiment in protocol
> > design. The vast majority of hosts do not care about them, and the
> > best use case for them is Denial of Service attack. So IMO the greater
> > good is to limit them as much as possible. If some private network
> > finds a use case for them they can also configure sysctls if they
> > don't like the default behavior.
>
> Thanks for the detailed context, Tom. May be good to fold into the
> cover letter or commit message if respinning.
Will do.
>
> One remaining question: these features may be largely unused by
> Linux systems, and Linux peers can be expected to follow RFC
> directives on ordering.
>
> But may there be other peers in the wild that are not necessarily
> malicious, but just less strict. IOW could this break legitimate
> users in the long tail of use cases of Linux in the wild?
It's exceedingly unlikely to be a problem, and if it is for someone
they can set the sysctl. I would also point out that the stack already
disallows having more than one fragment header.
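For reference, the ordering rule from the commit message can be modeled as a rank that must strictly increase along the header chain. The following is a rough userspace sketch, not the code in the patch: eh_order_ok(), the rank enum, and the way the two Destination Options positions are told apart are all assumptions made for illustration, and the sysctl is not modeled. The next-header values are the standard protocol numbers.

```c
#include <stddef.h>
#include <stdint.h>

/* Standard IPv6 next-header values for the extension headers. */
enum {
	NH_HOPOPTS  = 0,
	NH_ROUTING  = 43,
	NH_FRAGMENT = 44,
	NH_ESP      = 50,
	NH_AH       = 51,
	NH_DSTOPTS  = 60,
};

/* Position of each header in the RFC8200 recommended order. */
enum eh_rank {
	RANK_NONE,
	RANK_HBH,
	RANK_DEST_PRE_RH,	/* Destination Options before the Routing Header */
	RANK_ROUTING,
	RANK_FRAGMENT,
	RANK_AUTH,
	RANK_ESP,
	RANK_DEST_FINAL,	/* Destination Options before the upper layer */
};

/*
 * chain[] holds the types of the extension headers in packet order,
 * excluding the upper-layer header. Returns 1 if the chain follows the
 * recommended order with each position used at most once, else 0.
 */
static int eh_order_ok(const uint8_t *chain, size_t n)
{
	enum eh_rank last = RANK_NONE;
	size_t i;

	for (i = 0; i < n; i++) {
		enum eh_rank rank;

		switch (chain[i]) {
		case NH_HOPOPTS:
			rank = RANK_HBH;
			break;
		case NH_DSTOPTS:
			/* Assumption: the first DestOpts seen before anything
			 * other than HBH takes the pre-Routing-Header slot. */
			rank = last <= RANK_HBH ? RANK_DEST_PRE_RH
						: RANK_DEST_FINAL;
			break;
		case NH_ROUTING:
			rank = RANK_ROUTING;
			break;
		case NH_FRAGMENT:
			rank = RANK_FRAGMENT;
			break;
		case NH_AH:
			rank = RANK_AUTH;
			break;
		case NH_ESP:
			rank = RANK_ESP;
			break;
		default:
			return 0;	/* not an extension header modeled here */
		}

		if (rank <= last)
			return 0;	/* out of order or repeated */
		last = rank;
	}
	return 1;
}
```

In this model a second Fragment header fails for the same reason as any other repeat: its rank does not increase. Destination Options may appear twice, but a third occurrence is rejected.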
> Probably good to explicitly state if we are not aware of any.
>
> Or abuses of extension headers for private reasons inside non-public
> networks.
>
> The risk/reward of lowering from 8 to 2 to me offers at best a small
> real increase in security, but it may have real regression risk in odd
> installations?
I think it's very unlikely to cause any issues. We reduced the number
from infinity to eight without any complaints.
Tom
* Re: [PATCH net-next v5 2/7] ipv6: Cleanup IPv6 TLV definitions
2026-02-01 8:48 ` Justin Iurman
@ 2026-02-02 22:37 ` Tom Herbert
0 siblings, 0 replies; 30+ messages in thread
From: Tom Herbert @ 2026-02-02 22:37 UTC (permalink / raw)
To: Justin Iurman; +Cc: Willem de Bruijn, davem, kuba, netdev
On Sun, Feb 1, 2026 at 12:48 AM Justin Iurman <justin.iurman@gmail.com> wrote:
>
> On 1/30/26 18:22, Tom Herbert wrote:
> > On Thu, Jan 29, 2026 at 10:13 AM Justin Iurman <justin.iurman@gmail.com> wrote:
> >>
> >> On 1/29/26 06:30, Willem de Bruijn wrote:
> >>> Tom Herbert wrote:
> >>>> Move IPV6_TLV_TNL_ENCAP_LIMIT to uapi/linux/in6.h to be with the rest
> >>>> of the TLV definitions. Label each of the TLV definitions as to whether
> >>>> they are a Hop-by-Hop option, Destination option, or both.
> >>>>
> >>>> Signed-off-by: Tom Herbert <tom@herbertland.com>
> >>>> ---
> >>>> include/uapi/linux/in6.h | 21 ++++++++++++++-------
> >>>> include/uapi/linux/ip6_tunnel.h | 1 -
> >>>> 2 files changed, 14 insertions(+), 8 deletions(-)
> >>>>
> >>>> diff --git a/include/uapi/linux/in6.h b/include/uapi/linux/in6.h
> >>>> index 5a47339ef7d7..438283dc5fde 100644
> >>>> --- a/include/uapi/linux/in6.h
> >>>> +++ b/include/uapi/linux/in6.h
> >>>> @@ -140,14 +140,21 @@ struct in6_flowlabel_req {
> >>>>
> >>>> /*
> >>>> * IPv6 TLV options.
> >>>> + *
> >>>> + * Hop-by-Hop and Destination options share the same number space.
> >>>> + * For each option below whether it is a Hop-by-Hop option or
> >>>> + * a Destination option is indicated by HBH or DestOpt.
> >>>> */
> >>>> -#define IPV6_TLV_PAD1 0
> >>>> -#define IPV6_TLV_PADN 1
> >>>> -#define IPV6_TLV_ROUTERALERT 5
> >>>> -#define IPV6_TLV_CALIPSO 7 /* RFC 5570 */
> >>>> -#define IPV6_TLV_IOAM 49 /* RFC 9486 */
> >>>> -#define IPV6_TLV_JUMBO 194
> >>>> -#define IPV6_TLV_HAO 201 /* home address option */
> >>>> +#define IPV6_TLV_PAD1 0 /* HBH or DestOpt */
> >>>> +#define IPV6_TLV_PADN 1 /* HBH or DestOpt */
> >>>> +#define IPV6_TLV_TNL_ENCAP_LIMIT 4 /* RFC 2473, DestOpt */
> >>>> +#define IPV6_TLV_ROUTERALERT 5 /* HBH */
> >>>> +#define IPV6_TLV_CALIPSO 7 /* RFC 5570, HBH */
> >>>> +#define IPV6_TLV_IOAM 49 /* RFC 9486, HBH or Destopt
> >>>> + * IOAM sent and rcvd as HBH
> >>>
> >>> Explicit labeling with HBH or Destopt is quite informative.
> >>>
> >>> Does this mean that IPV6_TLV_IOAM should also be accepted in ip6_parse_tlv
> >>> in the Destopt branch? RFC 9486 indeed did reserve a number.
> >>
> >> Nope, not right now. The only IOAM option currently implemented in the
> >> kernel is the Pre-allocated Trace, which uses a Hop-by-Hop option. It
> >> wouldn't make sense to have it in a Destination option, although you
> >> could (i.e., it's not forbidden, just weird). Actually, the only IOAM
> >> option that would make sense to carry in a Destination Option is the
> >> Edge-to-Edge (E2E), but it's not implemented in the kernel. Should it be
> >> implemented at some point, then yes, you'd have IPV6_TLV_IOAM in the
> >> Destopt branch as well.
> >
> > Justin,
> >
> > Conceptually, someone could put IOAM in Destination Options before the
> > Routing Header. There's about 0% of that ever happening though.
>
> Tom,
>
> Correct. However, I wouldn't say there's about 0% of that ever
> happening. At some point, I remember that we even thought about using
> the IOAM Pre-allocated Trace in the Destination Options header (the
> first one, before the RH). The goal was to use it with SRv6 and collect
> telemetry for the overlay. There was also an attempt at including IOAM
> within SRv6 directly (draft-ali-spring-ioam-srv6), which didn't get
> consensus at that time.
Hi Justin,
For sure, there's always going to be someone who wants to use these
things someday. But to date, these have all been nothing more than a
pipe dream for the past forty years. I know that the Internet has its
roots in being an experimental network and a playground, but on the
other hand at some point we have to acknowledge that it's kind of
critical to humankind. We can't keep every option open just because
someone thinks they might use it someday. By definition, any code path
that is never used is a liability, so at some point the greater good
is to remove it rather than keep it around just in case someone
finally finds a use for it (other than DDoS).
Does it really make sense to keep code paths running in 5 billion
devices when the vast majority of them, possibly even all of them,
don't care about those paths? -- I think it's a fair question to at
least ask :-).
Tom
>
> Justin
>
> > Tom
> >
> >>
> >>>> + */
> >>>> +#define IPV6_TLV_JUMBO 194 /* HBH */
> >>>> +#define IPV6_TLV_HAO 201 /* home address option, DestOpt */
> >>>>
> >>>> /*
> >>>> * IPV6 socket options
> >>>> diff --git a/include/uapi/linux/ip6_tunnel.h b/include/uapi/linux/ip6_tunnel.h
> >>>> index 85182a839d42..35af4d9c35fb 100644
> >>>> --- a/include/uapi/linux/ip6_tunnel.h
> >>>> +++ b/include/uapi/linux/ip6_tunnel.h
> >>>> @@ -6,7 +6,6 @@
> >>>> #include <linux/if.h> /* For IFNAMSIZ. */
> >>>> #include <linux/in6.h> /* For struct in6_addr. */
> >>>>
> >>>> -#define IPV6_TLV_TNL_ENCAP_LIMIT 4
> >>>> #define IPV6_DEFAULT_TNL_ENCAP_LIMIT 4
> >>>>
> >>>> /* don't add encapsulation limit if one isn't present in inner packet */
> >>>> --
> >>>> 2.43.0
> >>>>
> >>>
> >>>
> >>
>
end of thread, other threads:[~2026-02-02 22:37 UTC | newest]
Thread overview: 30+ messages
2026-01-26 19:48 [PATCH net-next v5 0/7] ipv6: Address ext hdr DoS vulnerabilities Tom Herbert
2026-01-26 19:48 ` [PATCH net-next v5 1/7] ipv6: Check of max HBH or DestOp sysctl is zero and drop if it is Tom Herbert
2026-01-27 17:49 ` Justin Iurman
2026-01-27 17:50 ` Justin Iurman
2026-01-26 19:48 ` [PATCH net-next v5 2/7] ipv6: Cleanup IPv6 TLV definitions Tom Herbert
2026-01-27 17:51 ` Justin Iurman
2026-01-29 5:30 ` Willem de Bruijn
2026-01-29 18:13 ` Justin Iurman
2026-01-29 19:01 ` Willem de Bruijn
2026-01-30 17:22 ` Tom Herbert
2026-02-01 8:48 ` Justin Iurman
2026-02-02 22:37 ` Tom Herbert
2026-01-26 19:48 ` [PATCH net-next v5 3/7] ipv6: Add case for IPV6_TLV_TNL_ENCAP_LIMIT in EH TLV switch Tom Herbert
2026-01-27 17:52 ` Justin Iurman
2026-01-29 5:31 ` Willem de Bruijn
2026-01-26 19:48 ` [PATCH net-next v5 4/7] ipv6: Set HBH and DestOpt limits to 2 Tom Herbert
2026-01-27 17:55 ` Justin Iurman
2026-01-26 19:48 ` [PATCH net-next v5 5/7] ipv6: Document defaults for max_{dst|hbh}_opts_number sysctls Tom Herbert
2026-01-27 17:57 ` Justin Iurman
2026-01-26 19:48 ` [PATCH net-next v5 6/7] ipv6: Enforce Extension Header ordering Tom Herbert
2026-01-27 19:48 ` Justin Iurman
2026-01-29 5:18 ` Willem de Bruijn
2026-01-29 18:07 ` Justin Iurman
2026-01-29 19:05 ` Willem de Bruijn
2026-01-29 20:13 ` Justin Iurman
2026-01-30 17:06 ` Tom Herbert
2026-01-31 17:24 ` Willem de Bruijn
2026-02-02 22:21 ` Tom Herbert
2026-01-26 19:48 ` [PATCH net-next v5 7/7] ipv6: Document enforce_ext_hdr_order sysctl Tom Herbert
2026-01-27 18:00 ` Justin Iurman