* [PATCH net v4] ipv6: Implement limits on extension header parsing
@ 2026-04-28 15:37 Daniel Borkmann
2026-04-28 17:58 ` Justin Iurman
0 siblings, 1 reply; 5+ messages in thread
From: Daniel Borkmann @ 2026-04-28 15:37 UTC (permalink / raw)
To: kuba
Cc: edumazet, dsahern, tom, willemdebruijn.kernel, idosch, pabeni,
justin.iurman, netdev
ipv6_{skip_exthdr,find_hdr}() and ip6_{tnl_parse_tlv_enc_lim,
protocol_deliver_rcu}() iterate over IPv6 extension headers until they
find a non-extension-header protocol or run out of packet data. The
loops have no iteration counter, relying solely on the packet length
to bound them. For a crafted packet with 8-byte extension headers
filling a 64KB jumbogram, this means a worst case of up to ~8k
iterations with a skb_header_pointer call each. ipv6_skip_exthdr(),
for example, is used where it parses the inner quoted packet inside
an incoming ICMPv6 error:
- icmpv6_rcv
- checksum validation
- case ICMPV6_DEST_UNREACH
- icmpv6_notify
- pskb_may_pull() <- pull inner IPv6 header
- ipv6_skip_exthdr() <- iterates here
- pskb_may_pull()
- ipprot->err_handler() <- sk lookup
The per-iteration cost of ipv6_skip_exthdr itself is generally
light, but skb_header_pointer becomes more costly on reassembled
packets: the first ~1232 bytes of the inner packet are in the skb's
linear area, but the remaining ~63KB are in the frag_list where
skb_copy_bits is needed to read data.
Initially, the idea was to add a configurable limit via a new
sysctl knob with default 8, in line with knobs from commit
47d3d7ac656a ("ipv6: Implement limits on Hop-by-Hop and Destination
options"), but two reasons eventually argued against it:
- It adds to UAPI that needs to be maintained forever, and
upcoming work is restricting extension header ordering anyway,
leaving little reason for another sysctl knob
- exthdrs_core.c is always built-in even when CONFIG_IPV6=n,
where struct net has no .ipv6 member, so the read site would
need an ifdef'd fallback to a constant anyway
Therefore, just use a constant (IP6_MAX_EXT_HDRS_CNT). All four
extension header walking functions are now bound by this limit.
Note that the check in ip6_protocol_deliver_rcu() happens right
before the goto resubmit, such that we don't have to have a test
for ipv6_ext_hdr() in the fast-path.
There's an ongoing IETF draft-iurman-6man-eh-occurrences to enforce
IPv6 extension headers ordering and occurrence. The latter also
discusses security implications. As per RFC8200 section 4.1, the
occurrence rules for extension headers provide a practical upper
bound which is 8. In order to be conservative, let's define
IP6_MAX_EXT_HDRS_CNT as 4x that to leave enough room for quirky
setups. In the unlikely event that this is still not enough, then
we might need to reconsider a sysctl.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
v3->v4:
- Switching to hard-coded define with 32 limit as discussed
earlier, otherwise fixing the kbuild-bot error becomes
to ugly.. was worth a try, but given the case of being
built-in also with CONFIG_IPV6=n define seems better
(Justin, Victor, kbuild-bot)
v2->v3:
- Adding IP6SKB_HOPBYHOP coverage (Justin)
- I left the limit at 8 w/ sysctl, still feels the better
option to me if we can keep the worst-case more tightened
v1->v2:
- Set the default to 8 (Justin)
- Update IETF references (Justin)
- Add core path coverage as well (Justin)
include/net/dropreason-core.h | 6 ++++++
include/net/ipv6.h | 3 +++
net/ipv6/exthdrs_core.c | 7 +++++++
net/ipv6/ip6_input.c | 5 +++++
net/ipv6/ip6_tunnel.c | 4 ++++
5 files changed, 25 insertions(+)
diff --git a/include/net/dropreason-core.h b/include/net/dropreason-core.h
index e0ca3904ff8e..2f312d1f67d6 100644
--- a/include/net/dropreason-core.h
+++ b/include/net/dropreason-core.h
@@ -99,6 +99,7 @@
FN(FRAG_TOO_FAR) \
FN(TCP_MINTTL) \
FN(IPV6_BAD_EXTHDR) \
+ FN(IPV6_TOO_MANY_EXTHDRS) \
FN(IPV6_NDISC_FRAG) \
FN(IPV6_NDISC_HOP_LIMIT) \
FN(IPV6_NDISC_BAD_CODE) \
@@ -494,6 +495,11 @@ enum skb_drop_reason {
SKB_DROP_REASON_TCP_MINTTL,
/** @SKB_DROP_REASON_IPV6_BAD_EXTHDR: Bad IPv6 extension header. */
SKB_DROP_REASON_IPV6_BAD_EXTHDR,
+ /**
+ * @SKB_DROP_REASON_IPV6_TOO_MANY_EXTHDRS: Number of IPv6 extension
+ * headers in the packet exceeds IP6_MAX_EXT_HDRS_CNT.
+ */
+ SKB_DROP_REASON_IPV6_TOO_MANY_EXTHDRS,
/** @SKB_DROP_REASON_IPV6_NDISC_FRAG: invalid frag (suppress_frag_ndisc). */
SKB_DROP_REASON_IPV6_NDISC_FRAG,
/** @SKB_DROP_REASON_IPV6_NDISC_HOP_LIMIT: invalid hop limit. */
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index d042afe7a245..118a24ac9e73 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -90,6 +90,9 @@ struct ip_tunnel_info;
#define IP6_DEFAULT_MAX_DST_OPTS_LEN INT_MAX /* No limit */
#define IP6_DEFAULT_MAX_HBH_OPTS_LEN INT_MAX /* No limit */
+/* Hard limit on traversed IPv6 extension headers */
+#define IP6_MAX_EXT_HDRS_CNT 32
+
/*
* Addr type
*
diff --git a/net/ipv6/exthdrs_core.c b/net/ipv6/exthdrs_core.c
index 49e31e4ae7b7..9d06d487e8b1 100644
--- a/net/ipv6/exthdrs_core.c
+++ b/net/ipv6/exthdrs_core.c
@@ -73,6 +73,7 @@ int ipv6_skip_exthdr(const struct sk_buff *skb, int start, u8 *nexthdrp,
__be16 *frag_offp)
{
u8 nexthdr = *nexthdrp;
+ int exthdr_cnt = 0;
*frag_offp = 0;
@@ -82,6 +83,8 @@ int ipv6_skip_exthdr(const struct sk_buff *skb, int start, u8 *nexthdrp,
if (nexthdr == NEXTHDR_NONE)
return -1;
+ if (unlikely(exthdr_cnt++ >= IP6_MAX_EXT_HDRS_CNT))
+ return -1;
hp = skb_header_pointer(skb, start, sizeof(_hdr), &_hdr);
if (!hp)
return -1;
@@ -190,6 +193,7 @@ int ipv6_find_hdr(const struct sk_buff *skb, unsigned int *offset,
{
unsigned int start = skb_network_offset(skb) + sizeof(struct ipv6hdr);
u8 nexthdr = ipv6_hdr(skb)->nexthdr;
+ int exthdr_cnt = 0;
bool found;
if (fragoff)
@@ -216,6 +220,9 @@ int ipv6_find_hdr(const struct sk_buff *skb, unsigned int *offset,
return -ENOENT;
}
+ if (unlikely(exthdr_cnt++ >= IP6_MAX_EXT_HDRS_CNT))
+ return -EBADMSG;
+
hp = skb_header_pointer(skb, start, sizeof(_hdr), &_hdr);
if (!hp)
return -EBADMSG;
diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
index 967b07aeb683..8972863c93ee 100644
--- a/net/ipv6/ip6_input.c
+++ b/net/ipv6/ip6_input.c
@@ -403,6 +403,7 @@ INDIRECT_CALLABLE_DECLARE(int tcp_v6_rcv(struct sk_buff *));
void ip6_protocol_deliver_rcu(struct net *net, struct sk_buff *skb, int nexthdr,
bool have_final)
{
+ int exthdr_cnt = IP6CB(skb)->flags & IP6SKB_HOPBYHOP ? 1 : 0;
const struct inet6_protocol *ipprot;
struct inet6_dev *idev;
unsigned int nhoff;
@@ -487,6 +488,10 @@ void ip6_protocol_deliver_rcu(struct net *net, struct sk_buff *skb, int nexthdr,
nexthdr = ret;
goto resubmit_final;
} else {
+ if (unlikely(exthdr_cnt++ >= IP6_MAX_EXT_HDRS_CNT)) {
+ SKB_DR_SET(reason, IPV6_TOO_MANY_EXTHDRS);
+ goto discard;
+ }
goto resubmit;
}
} else if (ret == 0) {
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index c468c83af0f2..9d1037ac082f 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -399,11 +399,15 @@ __u16 ip6_tnl_parse_tlv_enc_lim(struct sk_buff *skb, __u8 *raw)
unsigned int nhoff = raw - skb->data;
unsigned int off = nhoff + sizeof(*ipv6h);
u8 nexthdr = ipv6h->nexthdr;
+ int exthdr_cnt = 0;
while (ipv6_ext_hdr(nexthdr) && nexthdr != NEXTHDR_NONE) {
struct ipv6_opt_hdr *hdr;
u16 optlen;
+ if (unlikely(exthdr_cnt++ >= IP6_MAX_EXT_HDRS_CNT))
+ break;
+
if (!pskb_may_pull(skb, off + sizeof(*hdr)))
break;
--
2.43.0
^ permalink raw reply related [flat|nested] 5+ messages in thread* Re: [PATCH net v4] ipv6: Implement limits on extension header parsing
2026-04-28 15:37 [PATCH net v4] ipv6: Implement limits on extension header parsing Daniel Borkmann
@ 2026-04-28 17:58 ` Justin Iurman
2026-04-28 18:16 ` Daniel Borkmann
0 siblings, 1 reply; 5+ messages in thread
From: Justin Iurman @ 2026-04-28 17:58 UTC (permalink / raw)
To: Daniel Borkmann, kuba
Cc: edumazet, dsahern, tom, willemdebruijn.kernel, idosch, pabeni,
netdev
On 4/28/26 17:37, Daniel Borkmann wrote:
> ipv6_{skip_exthdr,find_hdr}() and ip6_{tnl_parse_tlv_enc_lim,
> protocol_deliver_rcu}() iterate over IPv6 extension headers until they
> find a non-extension-header protocol or run out of packet data. The
> loops have no iteration counter, relying solely on the packet length
> to bound them. For a crafted packet with 8-byte extension headers
> filling a 64KB jumbogram, this means a worst case of up to ~8k
> iterations with a skb_header_pointer call each. ipv6_skip_exthdr(),
> for example, is used where it parses the inner quoted packet inside
> an incoming ICMPv6 error:
>
> - icmpv6_rcv
> - checksum validation
> - case ICMPV6_DEST_UNREACH
> - icmpv6_notify
> - pskb_may_pull() <- pull inner IPv6 header
> - ipv6_skip_exthdr() <- iterates here
> - pskb_may_pull()
> - ipprot->err_handler() <- sk lookup
>
> The per-iteration cost of ipv6_skip_exthdr itself is generally
> light, but skb_header_pointer becomes more costly on reassembled
> packets: the first ~1232 bytes of the inner packet are in the skb's
> linear area, but the remaining ~63KB are in the frag_list where
> skb_copy_bits is needed to read data.
>
> Initially, the idea was to add a configurable limit via a new
> sysctl knob with default 8, in line with knobs from commit
> 47d3d7ac656a ("ipv6: Implement limits on Hop-by-Hop and Destination
> options"), but two reasons eventually argued against it:
>
> - It adds to UAPI that needs to be maintained forever, and
> upcoming work is restricting extension header ordering anyway,
> leaving little reason for another sysctl knob
> - exthdrs_core.c is always built-in even when CONFIG_IPV6=n,
> where struct net has no .ipv6 member, so the read site would
> need an ifdef'd fallback to a constant anyway
>
> Therefore, just use a constant (IP6_MAX_EXT_HDRS_CNT). All four
> extension header walking functions are now bound by this limit.
>
> Note that the check in ip6_protocol_deliver_rcu() happens right
> before the goto resubmit, such that we don't have to have a test
> for ipv6_ext_hdr() in the fast-path.
>
> There's an ongoing IETF draft-iurman-6man-eh-occurrences to enforce
> IPv6 extension headers ordering and occurrence. The latter also
> discusses security implications. As per RFC8200 section 4.1, the
> occurrence rules for extension headers provide a practical upper
> bound which is 8. In order to be conservative, let's define
> IP6_MAX_EXT_HDRS_CNT as 4x that to leave enough room for quirky
> setups. In the unlikely event that this is still not enough, then
> we might need to reconsider a sysctl.
>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Justin Iurman <justin.iurman@gmail.com>
FWIW, I prefer v4 over v3. I didn't like the idea of a new sysctl for
several reasons I already mentioned. So, thanks for this new version,
Daniel.
However, I can't help but think 8 would pose absolutely no problem. I
would be very surprised to see someone complain. We're choosing 32 over
8 to be extra safe, at the price of security. RFC8200 provides the
following ordered list which is recommended for senders (without
normative language, though, which was a mistake):
Hop-by-Hop Options header
Destination Options header
Routing header
Fragment header
Authentication header
Encapsulating Security Payload header
Destination Options header
So this is the maximum you can have theoretically***, although you
wouldn't for instance use the Authentication header with ESP. I'm not
even talking about ordering or specific number of occurrences here, just
the total number of Extension Headers in a packet (as your patch does).
It's also worth mentioning that it's highly unlikely to see someone use
them all at the same time (in production, of course). This is why I
still think that 8 is safe too, and would provide security as expected.
***well, you also have 3 others [1] (Mobility Header, Host Identity
Protocol, Shim6 Protocol), but they're not widely used (not to say
dead). And they would probably not be used with other Extension Headers
anyway, as they are not specified to be present with a layer-4 in most
cases.
[1]
https://www.iana.org/assignments/ipv6-parameters/ipv6-parameters.xhtml#extension-header
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [PATCH net v4] ipv6: Implement limits on extension header parsing
2026-04-28 17:58 ` Justin Iurman
@ 2026-04-28 18:16 ` Daniel Borkmann
2026-04-28 18:54 ` Justin Iurman
0 siblings, 1 reply; 5+ messages in thread
From: Daniel Borkmann @ 2026-04-28 18:16 UTC (permalink / raw)
To: Justin Iurman, kuba
Cc: edumazet, dsahern, tom, willemdebruijn.kernel, idosch, pabeni,
netdev
On 4/28/26 7:58 PM, Justin Iurman wrote:
> On 4/28/26 17:37, Daniel Borkmann wrote:
>> ipv6_{skip_exthdr,find_hdr}() and ip6_{tnl_parse_tlv_enc_lim,
>> protocol_deliver_rcu}() iterate over IPv6 extension headers until they
>> find a non-extension-header protocol or run out of packet data. The
>> loops have no iteration counter, relying solely on the packet length
>> to bound them. For a crafted packet with 8-byte extension headers
>> filling a 64KB jumbogram, this means a worst case of up to ~8k
>> iterations with a skb_header_pointer call each. ipv6_skip_exthdr(),
>> for example, is used where it parses the inner quoted packet inside
>> an incoming ICMPv6 error:
>>
>> - icmpv6_rcv
>> - checksum validation
>> - case ICMPV6_DEST_UNREACH
>> - icmpv6_notify
>> - pskb_may_pull() <- pull inner IPv6 header
>> - ipv6_skip_exthdr() <- iterates here
>> - pskb_may_pull()
>> - ipprot->err_handler() <- sk lookup
>>
>> The per-iteration cost of ipv6_skip_exthdr itself is generally
>> light, but skb_header_pointer becomes more costly on reassembled
>> packets: the first ~1232 bytes of the inner packet are in the skb's
>> linear area, but the remaining ~63KB are in the frag_list where
>> skb_copy_bits is needed to read data.
>>
>> Initially, the idea was to add a configurable limit via a new
>> sysctl knob with default 8, in line with knobs from commit
>> 47d3d7ac656a ("ipv6: Implement limits on Hop-by-Hop and Destination
>> options"), but two reasons eventually argued against it:
>>
>> - It adds to UAPI that needs to be maintained forever, and
>> upcoming work is restricting extension header ordering anyway,
>> leaving little reason for another sysctl knob
>> - exthdrs_core.c is always built-in even when CONFIG_IPV6=n,
>> where struct net has no .ipv6 member, so the read site would
>> need an ifdef'd fallback to a constant anyway
>>
>> Therefore, just use a constant (IP6_MAX_EXT_HDRS_CNT). All four
>> extension header walking functions are now bound by this limit.
>>
>> Note that the check in ip6_protocol_deliver_rcu() happens right
>> before the goto resubmit, such that we don't have to have a test
>> for ipv6_ext_hdr() in the fast-path.
>>
>> There's an ongoing IETF draft-iurman-6man-eh-occurrences to enforce
>> IPv6 extension headers ordering and occurrence. The latter also
>> discusses security implications. As per RFC8200 section 4.1, the
>> occurrence rules for extension headers provide a practical upper
>> bound which is 8. In order to be conservative, let's define
>> IP6_MAX_EXT_HDRS_CNT as 4x that to leave enough room for quirky
>> setups. In the unlikely event that this is still not enough, then
>> we might need to reconsider a sysctl.
>>
>> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
>
> Reviewed-by: Justin Iurman <justin.iurman@gmail.com>
>
> FWIW, I prefer v4 over v3. I didn't like the idea of a new sysctl for several reasons I already mentioned. So, thanks for this new version, Daniel.
>
> However, I can't help but think 8 would pose absolutely no problem. I would be very surprised to see someone complain. We're choosing 32 over 8 to be extra safe, at the price of security. RFC8200 provides the following ordered list which is recommended for senders (without normative language, though, which was a mistake):
>
> Hop-by-Hop Options header
> Destination Options header
> Routing header
> Fragment header
> Authentication header
> Encapsulating Security Payload header
> Destination Options header
>
> So this is the maximum you can have theoretically***, although you wouldn't for instance use the Authentication header with ESP. I'm not even talking about ordering or specific number of occurrences here, just the total number of Extension Headers in a packet (as your patch does). It's also worth mentioning that it's highly unlikely to see someone use them all at the same time (in production, of course). This is why I still think that 8 is safe too, and would provide security as expected.
What do you think if we fix this at 12 then to also have the exotic cases
below covered just in case to be conservative? Would still be an improvement
over 32 fwiw. I'm also fine if you think straight to 8 is the better choice.
Either option I can spin a v5, np.
Thanks,
Daniel
> ***well, you also have 3 others [1] (Mobility Header, Host Identity Protocol, Shim6 Protocol), but they're not widely used (not to say dead). And they would probably not be used with other Extension Headers anyway, as they are not specified to be present with a layer-4 in most cases.
>
> [1] https://www.iana.org/assignments/ipv6-parameters/ipv6-parameters.xhtml#extension-header
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [PATCH net v4] ipv6: Implement limits on extension header parsing
2026-04-28 18:16 ` Daniel Borkmann
@ 2026-04-28 18:54 ` Justin Iurman
2026-04-28 19:47 ` Daniel Borkmann
0 siblings, 1 reply; 5+ messages in thread
From: Justin Iurman @ 2026-04-28 18:54 UTC (permalink / raw)
To: Daniel Borkmann, kuba
Cc: edumazet, dsahern, tom, willemdebruijn.kernel, idosch, pabeni,
netdev
On 4/28/26 20:16, Daniel Borkmann wrote:
> On 4/28/26 7:58 PM, Justin Iurman wrote:
>> On 4/28/26 17:37, Daniel Borkmann wrote:
>>> ipv6_{skip_exthdr,find_hdr}() and ip6_{tnl_parse_tlv_enc_lim,
>>> protocol_deliver_rcu}() iterate over IPv6 extension headers until they
>>> find a non-extension-header protocol or run out of packet data. The
>>> loops have no iteration counter, relying solely on the packet length
>>> to bound them. For a crafted packet with 8-byte extension headers
>>> filling a 64KB jumbogram, this means a worst case of up to ~8k
>>> iterations with a skb_header_pointer call each. ipv6_skip_exthdr(),
>>> for example, is used where it parses the inner quoted packet inside
>>> an incoming ICMPv6 error:
>>>
>>> - icmpv6_rcv
>>> - checksum validation
>>> - case ICMPV6_DEST_UNREACH
>>> - icmpv6_notify
>>> - pskb_may_pull() <- pull inner IPv6 header
>>> - ipv6_skip_exthdr() <- iterates here
>>> - pskb_may_pull()
>>> - ipprot->err_handler() <- sk lookup
>>>
>>> The per-iteration cost of ipv6_skip_exthdr itself is generally
>>> light, but skb_header_pointer becomes more costly on reassembled
>>> packets: the first ~1232 bytes of the inner packet are in the skb's
>>> linear area, but the remaining ~63KB are in the frag_list where
>>> skb_copy_bits is needed to read data.
>>>
>>> Initially, the idea was to add a configurable limit via a new
>>> sysctl knob with default 8, in line with knobs from commit
>>> 47d3d7ac656a ("ipv6: Implement limits on Hop-by-Hop and Destination
>>> options"), but two reasons eventually argued against it:
>>>
>>> - It adds to UAPI that needs to be maintained forever, and
>>> upcoming work is restricting extension header ordering anyway,
>>> leaving little reason for another sysctl knob
>>> - exthdrs_core.c is always built-in even when CONFIG_IPV6=n,
>>> where struct net has no .ipv6 member, so the read site would
>>> need an ifdef'd fallback to a constant anyway
>>>
>>> Therefore, just use a constant (IP6_MAX_EXT_HDRS_CNT). All four
>>> extension header walking functions are now bound by this limit.
>>>
>>> Note that the check in ip6_protocol_deliver_rcu() happens right
>>> before the goto resubmit, such that we don't have to have a test
>>> for ipv6_ext_hdr() in the fast-path.
>>>
>>> There's an ongoing IETF draft-iurman-6man-eh-occurrences to enforce
>>> IPv6 extension headers ordering and occurrence. The latter also
>>> discusses security implications. As per RFC8200 section 4.1, the
>>> occurrence rules for extension headers provide a practical upper
>>> bound which is 8. In order to be conservative, let's define
>>> IP6_MAX_EXT_HDRS_CNT as 4x that to leave enough room for quirky
>>> setups. In the unlikely event that this is still not enough, then
>>> we might need to reconsider a sysctl.
>>>
>>> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
>>
>> Reviewed-by: Justin Iurman <justin.iurman@gmail.com>
>>
>> FWIW, I prefer v4 over v3. I didn't like the idea of a new sysctl for
>> several reasons I already mentioned. So, thanks for this new version,
>> Daniel.
>>
>> However, I can't help but think 8 would pose absolutely no problem. I
>> would be very surprised to see someone complain. We're choosing 32
>> over 8 to be extra safe, at the price of security. RFC8200 provides
>> the following ordered list which is recommended for senders (without
>> normative language, though, which was a mistake):
>>
>> Hop-by-Hop Options header
>> Destination Options header
>> Routing header
>> Fragment header
>> Authentication header
>> Encapsulating Security Payload header
>> Destination Options header
>>
>> So this is the maximum you can have theoretically***, although you
>> wouldn't for instance use the Authentication header with ESP. I'm not
>> even talking about ordering or specific number of occurrences here,
>> just the total number of Extension Headers in a packet (as your patch
>> does). It's also worth mentioning that it's highly unlikely to see
>> someone use them all at the same time (in production, of course). This
>> is why I still think that 8 is safe too, and would provide security as
>> expected.
>
> What do you think if we fix this at 12 then to also have the exotic cases
> below covered just in case to be conservative? Would still be an
> improvement
> over 32 fwiw. I'm also fine if you think straight to 8 is the better
> choice.
> Either option I can spin a v5, np.
I'm fine either way. IMO, 12 seems like a good compromise to keep
everyone happy.
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [PATCH net v4] ipv6: Implement limits on extension header parsing
2026-04-28 18:54 ` Justin Iurman
@ 2026-04-28 19:47 ` Daniel Borkmann
0 siblings, 0 replies; 5+ messages in thread
From: Daniel Borkmann @ 2026-04-28 19:47 UTC (permalink / raw)
To: Justin Iurman, kuba
Cc: edumazet, dsahern, tom, willemdebruijn.kernel, idosch, pabeni,
netdev
On 4/28/26 8:54 PM, Justin Iurman wrote:
> On 4/28/26 20:16, Daniel Borkmann wrote:
>> On 4/28/26 7:58 PM, Justin Iurman wrote:
>>> On 4/28/26 17:37, Daniel Borkmann wrote:
>>>> ipv6_{skip_exthdr,find_hdr}() and ip6_{tnl_parse_tlv_enc_lim,
>>>> protocol_deliver_rcu}() iterate over IPv6 extension headers until they
>>>> find a non-extension-header protocol or run out of packet data. The
>>>> loops have no iteration counter, relying solely on the packet length
>>>> to bound them. For a crafted packet with 8-byte extension headers
>>>> filling a 64KB jumbogram, this means a worst case of up to ~8k
>>>> iterations with a skb_header_pointer call each. ipv6_skip_exthdr(),
>>>> for example, is used where it parses the inner quoted packet inside
>>>> an incoming ICMPv6 error:
>>>>
>>>> - icmpv6_rcv
>>>> - checksum validation
>>>> - case ICMPV6_DEST_UNREACH
>>>> - icmpv6_notify
>>>> - pskb_may_pull() <- pull inner IPv6 header
>>>> - ipv6_skip_exthdr() <- iterates here
>>>> - pskb_may_pull()
>>>> - ipprot->err_handler() <- sk lookup
>>>>
>>>> The per-iteration cost of ipv6_skip_exthdr itself is generally
>>>> light, but skb_header_pointer becomes more costly on reassembled
>>>> packets: the first ~1232 bytes of the inner packet are in the skb's
>>>> linear area, but the remaining ~63KB are in the frag_list where
>>>> skb_copy_bits is needed to read data.
>>>>
>>>> Initially, the idea was to add a configurable limit via a new
>>>> sysctl knob with default 8, in line with knobs from commit
>>>> 47d3d7ac656a ("ipv6: Implement limits on Hop-by-Hop and Destination
>>>> options"), but two reasons eventually argued against it:
>>>>
>>>> - It adds to UAPI that needs to be maintained forever, and
>>>> upcoming work is restricting extension header ordering anyway,
>>>> leaving little reason for another sysctl knob
>>>> - exthdrs_core.c is always built-in even when CONFIG_IPV6=n,
>>>> where struct net has no .ipv6 member, so the read site would
>>>> need an ifdef'd fallback to a constant anyway
>>>>
>>>> Therefore, just use a constant (IP6_MAX_EXT_HDRS_CNT). All four
>>>> extension header walking functions are now bound by this limit.
>>>>
>>>> Note that the check in ip6_protocol_deliver_rcu() happens right
>>>> before the goto resubmit, such that we don't have to have a test
>>>> for ipv6_ext_hdr() in the fast-path.
>>>>
>>>> There's an ongoing IETF draft-iurman-6man-eh-occurrences to enforce
>>>> IPv6 extension headers ordering and occurrence. The latter also
>>>> discusses security implications. As per RFC8200 section 4.1, the
>>>> occurrence rules for extension headers provide a practical upper
>>>> bound which is 8. In order to be conservative, let's define
>>>> IP6_MAX_EXT_HDRS_CNT as 4x that to leave enough room for quirky
>>>> setups. In the unlikely event that this is still not enough, then
>>>> we might need to reconsider a sysctl.
>>>>
>>>> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
>>>
>>> Reviewed-by: Justin Iurman <justin.iurman@gmail.com>
>>>
>>> FWIW, I prefer v4 over v3. I didn't like the idea of a new sysctl for several reasons I already mentioned. So, thanks for this new version, Daniel.
>>>
>>> However, I can't help but think 8 would pose absolutely no problem. I would be very surprised to see someone complain. We're choosing 32 over 8 to be extra safe, at the price of security. RFC8200 provides the following ordered list which is recommended for senders (without normative language, though, which was a mistake):
>>>
>>> Hop-by-Hop Options header
>>> Destination Options header
>>> Routing header
>>> Fragment header
>>> Authentication header
>>> Encapsulating Security Payload header
>>> Destination Options header
>>>
>>> So this is the maximum you can have theoretically***, although you wouldn't for instance use the Authentication header with ESP. I'm not even talking about ordering or specific number of occurrences here, just the total number of Extension Headers in a packet (as your patch does). It's also worth mentioning that it's highly unlikely to see someone use them all at the same time (in production, of course). This is why I still think that 8 is safe too, and would provide security as expected.
>>
>> What do you think if we fix this at 12 then to also have the exotic cases
>> below covered just in case to be conservative? Would still be an improvement
>> over 32 fwiw. I'm also fine if you think straight to 8 is the better choice.
>> Either option I can spin a v5, np.
>
> I'm fine either way. IMO, 12 seems like a good compromise to keep everyone happy.
Ack, sounds good, I'll do that in a v5 tomorrow.
Thanks,
Daniel
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2026-04-28 19:47 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-28 15:37 [PATCH net v4] ipv6: Implement limits on extension header parsing Daniel Borkmann
2026-04-28 17:58 ` Justin Iurman
2026-04-28 18:16 ` Daniel Borkmann
2026-04-28 18:54 ` Justin Iurman
2026-04-28 19:47 ` Daniel Borkmann
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox