public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH net v5] ipv6: Implement limits on extension header parsing
@ 2026-04-29 15:46 Daniel Borkmann
  2026-04-30 12:40 ` Ido Schimmel
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Daniel Borkmann @ 2026-04-29 15:46 UTC (permalink / raw)
  To: kuba
  Cc: edumazet, dsahern, tom, willemdebruijn.kernel, idosch, pabeni,
	justin.iurman, netdev

ipv6_{skip_exthdr,find_hdr}() and ip6_{tnl_parse_tlv_enc_lim,
protocol_deliver_rcu}() iterate over IPv6 extension headers until they
find a non-extension-header protocol or run out of packet data. The
loops have no iteration counter, relying solely on the packet length
to bound them. For a crafted packet with 8-byte extension headers
filling a 64KB jumbogram, this means a worst case of up to ~8k
iterations with a skb_header_pointer call each. ipv6_skip_exthdr(),
for example, is used where it parses the inner quoted packet inside
an incoming ICMPv6 error:

  - icmpv6_rcv
    - checksum validation
    - case ICMPV6_DEST_UNREACH
      - icmpv6_notify
        - pskb_may_pull()       <- pull inner IPv6 header
        - ipv6_skip_exthdr()    <- iterates here
        - pskb_may_pull()
        - ipprot->err_handler() <- sk lookup

The per-iteration cost of ipv6_skip_exthdr itself is generally
light, but skb_header_pointer becomes more costly on reassembled
packets: the first ~1232 bytes of the inner packet are in the skb's
linear area, but the remaining ~63KB are in the frag_list where
skb_copy_bits is needed to read data.

Initially, the idea was to add a configurable limit via a new
sysctl knob with default 8, in line with knobs from commit
47d3d7ac656a ("ipv6: Implement limits on Hop-by-Hop and Destination
options"), but two reasons eventually argued against it:

- It adds to UAPI that needs to be maintained forever, and
  upcoming work is restricting extension header ordering anyway,
  leaving little reason for another sysctl knob
- exthdrs_core.c is always built-in even when CONFIG_IPV6=n,
  where struct net has no .ipv6 member, so the read site would
  need an ifdef'd fallback to a constant anyway

Therefore, just use a constant (IP6_MAX_EXT_HDRS_CNT). All four
extension header walking functions are now bound by this limit.

Note that the check in ip6_protocol_deliver_rcu() happens right
before the goto resubmit, such that we don't have to have a test
for ipv6_ext_hdr() in the fast-path.

There's an ongoing IETF draft-iurman-6man-eh-occurrences to enforce
IPv6 extension headers ordering and occurrence. The latter also
discusses security implications. As per RFC8200 section 4.1, the
occurrence rules for extension headers provide a practical upper
bound which is 8. In order to be conservative, let's define
IP6_MAX_EXT_HDRS_CNT as 12 to leave enough room for quirky setups.
In the unlikely event that this is still not enough, then we might
need to reconsider a sysctl.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 v4->v5:
   - Reduce the hard-coded limit from 32 to 12 (Justin)
   - Rest as-is
 v3->v4:
   - Switching to hard-coded define with 32 limit as discussed
     earlier, otherwise fixing the kbuild-bot error becomes
     to ugly.. was worth a try, but given the case of being
     built-in also with CONFIG_IPV6=n define seems better
     (Justin, Victor, kbuild-bot)
 v2->v3:
   - Adding IP6SKB_HOPBYHOP coverage (Justin)
   - I left the limit at 8 w/ sysctl, still feels the better
     option to me if we can keep the worst-case more tightened
 v1->v2:
   - Set the default to 8 (Justin)
   - Update IETF references (Justin)
   - Add core path coverage as well (Justin)

 include/net/dropreason-core.h | 6 ++++++
 include/net/ipv6.h            | 3 +++
 net/ipv6/exthdrs_core.c       | 7 +++++++
 net/ipv6/ip6_input.c          | 5 +++++
 net/ipv6/ip6_tunnel.c         | 4 ++++
 5 files changed, 25 insertions(+)

diff --git a/include/net/dropreason-core.h b/include/net/dropreason-core.h
index e0ca3904ff8e..2f312d1f67d6 100644
--- a/include/net/dropreason-core.h
+++ b/include/net/dropreason-core.h
@@ -99,6 +99,7 @@
 	FN(FRAG_TOO_FAR)		\
 	FN(TCP_MINTTL)			\
 	FN(IPV6_BAD_EXTHDR)		\
+	FN(IPV6_TOO_MANY_EXTHDRS)	\
 	FN(IPV6_NDISC_FRAG)		\
 	FN(IPV6_NDISC_HOP_LIMIT)	\
 	FN(IPV6_NDISC_BAD_CODE)		\
@@ -494,6 +495,11 @@ enum skb_drop_reason {
 	SKB_DROP_REASON_TCP_MINTTL,
 	/** @SKB_DROP_REASON_IPV6_BAD_EXTHDR: Bad IPv6 extension header. */
 	SKB_DROP_REASON_IPV6_BAD_EXTHDR,
+	/**
+	 * @SKB_DROP_REASON_IPV6_TOO_MANY_EXTHDRS: Number of IPv6 extension
+	 * headers in the packet exceeds IP6_MAX_EXT_HDRS_CNT.
+	 */
+	SKB_DROP_REASON_IPV6_TOO_MANY_EXTHDRS,
 	/** @SKB_DROP_REASON_IPV6_NDISC_FRAG: invalid frag (suppress_frag_ndisc). */
 	SKB_DROP_REASON_IPV6_NDISC_FRAG,
 	/** @SKB_DROP_REASON_IPV6_NDISC_HOP_LIMIT: invalid hop limit. */
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index d042afe7a245..1dec81faff28 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -90,6 +90,9 @@ struct ip_tunnel_info;
 #define IP6_DEFAULT_MAX_DST_OPTS_LEN	 INT_MAX /* No limit */
 #define IP6_DEFAULT_MAX_HBH_OPTS_LEN	 INT_MAX /* No limit */
 
+/* Hard limit on traversed IPv6 extension headers */
+#define IP6_MAX_EXT_HDRS_CNT		 12
+
 /*
  *	Addr type
  *	
diff --git a/net/ipv6/exthdrs_core.c b/net/ipv6/exthdrs_core.c
index 49e31e4ae7b7..9d06d487e8b1 100644
--- a/net/ipv6/exthdrs_core.c
+++ b/net/ipv6/exthdrs_core.c
@@ -73,6 +73,7 @@ int ipv6_skip_exthdr(const struct sk_buff *skb, int start, u8 *nexthdrp,
 		     __be16 *frag_offp)
 {
 	u8 nexthdr = *nexthdrp;
+	int exthdr_cnt = 0;
 
 	*frag_offp = 0;
 
@@ -82,6 +83,8 @@ int ipv6_skip_exthdr(const struct sk_buff *skb, int start, u8 *nexthdrp,
 
 		if (nexthdr == NEXTHDR_NONE)
 			return -1;
+		if (unlikely(exthdr_cnt++ >= IP6_MAX_EXT_HDRS_CNT))
+			return -1;
 		hp = skb_header_pointer(skb, start, sizeof(_hdr), &_hdr);
 		if (!hp)
 			return -1;
@@ -190,6 +193,7 @@ int ipv6_find_hdr(const struct sk_buff *skb, unsigned int *offset,
 {
 	unsigned int start = skb_network_offset(skb) + sizeof(struct ipv6hdr);
 	u8 nexthdr = ipv6_hdr(skb)->nexthdr;
+	int exthdr_cnt = 0;
 	bool found;
 
 	if (fragoff)
@@ -216,6 +220,9 @@ int ipv6_find_hdr(const struct sk_buff *skb, unsigned int *offset,
 			return -ENOENT;
 		}
 
+		if (unlikely(exthdr_cnt++ >= IP6_MAX_EXT_HDRS_CNT))
+			return -EBADMSG;
+
 		hp = skb_header_pointer(skb, start, sizeof(_hdr), &_hdr);
 		if (!hp)
 			return -EBADMSG;
diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
index 967b07aeb683..8972863c93ee 100644
--- a/net/ipv6/ip6_input.c
+++ b/net/ipv6/ip6_input.c
@@ -403,6 +403,7 @@ INDIRECT_CALLABLE_DECLARE(int tcp_v6_rcv(struct sk_buff *));
 void ip6_protocol_deliver_rcu(struct net *net, struct sk_buff *skb, int nexthdr,
 			      bool have_final)
 {
+	int exthdr_cnt = IP6CB(skb)->flags & IP6SKB_HOPBYHOP ? 1 : 0;
 	const struct inet6_protocol *ipprot;
 	struct inet6_dev *idev;
 	unsigned int nhoff;
@@ -487,6 +488,10 @@ void ip6_protocol_deliver_rcu(struct net *net, struct sk_buff *skb, int nexthdr,
 				nexthdr = ret;
 				goto resubmit_final;
 			} else {
+				if (unlikely(exthdr_cnt++ >= IP6_MAX_EXT_HDRS_CNT)) {
+					SKB_DR_SET(reason, IPV6_TOO_MANY_EXTHDRS);
+					goto discard;
+				}
 				goto resubmit;
 			}
 		} else if (ret == 0) {
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index c468c83af0f2..9d1037ac082f 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -399,11 +399,15 @@ __u16 ip6_tnl_parse_tlv_enc_lim(struct sk_buff *skb, __u8 *raw)
 	unsigned int nhoff = raw - skb->data;
 	unsigned int off = nhoff + sizeof(*ipv6h);
 	u8 nexthdr = ipv6h->nexthdr;
+	int exthdr_cnt = 0;
 
 	while (ipv6_ext_hdr(nexthdr) && nexthdr != NEXTHDR_NONE) {
 		struct ipv6_opt_hdr *hdr;
 		u16 optlen;
 
+		if (unlikely(exthdr_cnt++ >= IP6_MAX_EXT_HDRS_CNT))
+			break;
+
 		if (!pskb_may_pull(skb, off + sizeof(*hdr)))
 			break;
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH net v5] ipv6: Implement limits on extension header parsing
  2026-04-29 15:46 [PATCH net v5] ipv6: Implement limits on extension header parsing Daniel Borkmann
@ 2026-04-30 12:40 ` Ido Schimmel
  2026-04-30 13:25   ` Eric Dumazet
  2026-04-30 16:53 ` Justin Iurman
  2026-05-01  0:40 ` patchwork-bot+netdevbpf
  2 siblings, 1 reply; 5+ messages in thread
From: Ido Schimmel @ 2026-04-30 12:40 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: kuba, edumazet, dsahern, tom, willemdebruijn.kernel, pabeni,
	justin.iurman, netdev

On Wed, Apr 29, 2026 at 05:46:48PM +0200, Daniel Borkmann wrote:
> ipv6_{skip_exthdr,find_hdr}() and ip6_{tnl_parse_tlv_enc_lim,
> protocol_deliver_rcu}() iterate over IPv6 extension headers until they
> find a non-extension-header protocol or run out of packet data. The
> loops have no iteration counter, relying solely on the packet length
> to bound them. For a crafted packet with 8-byte extension headers
> filling a 64KB jumbogram, this means a worst case of up to ~8k
> iterations with a skb_header_pointer call each. ipv6_skip_exthdr(),
> for example, is used where it parses the inner quoted packet inside
> an incoming ICMPv6 error:
> 
>   - icmpv6_rcv
>     - checksum validation
>     - case ICMPV6_DEST_UNREACH
>       - icmpv6_notify
>         - pskb_may_pull()       <- pull inner IPv6 header
>         - ipv6_skip_exthdr()    <- iterates here
>         - pskb_may_pull()
>         - ipprot->err_handler() <- sk lookup
> 
> The per-iteration cost of ipv6_skip_exthdr itself is generally
> light, but skb_header_pointer becomes more costly on reassembled
> packets: the first ~1232 bytes of the inner packet are in the skb's
> linear area, but the remaining ~63KB are in the frag_list where
> skb_copy_bits is needed to read data.
> 
> Initially, the idea was to add a configurable limit via a new
> sysctl knob with default 8, in line with knobs from commit
> 47d3d7ac656a ("ipv6: Implement limits on Hop-by-Hop and Destination
> options"), but two reasons eventually argued against it:
> 
> - It adds to UAPI that needs to be maintained forever, and
>   upcoming work is restricting extension header ordering anyway,
>   leaving little reason for another sysctl knob
> - exthdrs_core.c is always built-in even when CONFIG_IPV6=n,
>   where struct net has no .ipv6 member, so the read site would
>   need an ifdef'd fallback to a constant anyway
> 
> Therefore, just use a constant (IP6_MAX_EXT_HDRS_CNT). All four
> extension header walking functions are now bound by this limit.
> 
> Note that the check in ip6_protocol_deliver_rcu() happens right
> before the goto resubmit, such that we don't have to have a test
> for ipv6_ext_hdr() in the fast-path.
> 
> There's an ongoing IETF draft-iurman-6man-eh-occurrences to enforce
> IPv6 extension headers ordering and occurrence. The latter also
> discusses security implications. As per RFC8200 section 4.1, the
> occurrence rules for extension headers provide a practical upper
> bound which is 8. In order to be conservative, let's define
> IP6_MAX_EXT_HDRS_CNT as 12 to leave enough room for quirky setups.
> In the unlikely event that this is still not enough, then we might
> need to reconsider a sysctl.
> 
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

Reviewed-by: Ido Schimmel <idosch@nvidia.com>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net v5] ipv6: Implement limits on extension header parsing
  2026-04-30 12:40 ` Ido Schimmel
@ 2026-04-30 13:25   ` Eric Dumazet
  0 siblings, 0 replies; 5+ messages in thread
From: Eric Dumazet @ 2026-04-30 13:25 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: Daniel Borkmann, kuba, dsahern, tom, willemdebruijn.kernel,
	pabeni, justin.iurman, netdev

On Thu, Apr 30, 2026 at 5:40 AM Ido Schimmel <idosch@nvidia.com> wrote:
>
> On Wed, Apr 29, 2026 at 05:46:48PM +0200, Daniel Borkmann wrote:
> > ipv6_{skip_exthdr,find_hdr}() and ip6_{tnl_parse_tlv_enc_lim,
> > protocol_deliver_rcu}() iterate over IPv6 extension headers until they
> > find a non-extension-header protocol or run out of packet data. The
> > loops have no iteration counter, relying solely on the packet length
> > to bound them. For a crafted packet with 8-byte extension headers
> > filling a 64KB jumbogram, this means a worst case of up to ~8k
> > iterations with a skb_header_pointer call each. ipv6_skip_exthdr(),
> > for example, is used where it parses the inner quoted packet inside
> > an incoming ICMPv6 error:
> >
> >   - icmpv6_rcv
> >     - checksum validation
> >     - case ICMPV6_DEST_UNREACH
> >       - icmpv6_notify
> >         - pskb_may_pull()       <- pull inner IPv6 header
> >         - ipv6_skip_exthdr()    <- iterates here
> >         - pskb_may_pull()
> >         - ipprot->err_handler() <- sk lookup
> >
> > The per-iteration cost of ipv6_skip_exthdr itself is generally
> > light, but skb_header_pointer becomes more costly on reassembled
> > packets: the first ~1232 bytes of the inner packet are in the skb's
> > linear area, but the remaining ~63KB are in the frag_list where
> > skb_copy_bits is needed to read data.
> >
> > Initially, the idea was to add a configurable limit via a new
> > sysctl knob with default 8, in line with knobs from commit
> > 47d3d7ac656a ("ipv6: Implement limits on Hop-by-Hop and Destination
> > options"), but two reasons eventually argued against it:
> >
> > - It adds to UAPI that needs to be maintained forever, and
> >   upcoming work is restricting extension header ordering anyway,
> >   leaving little reason for another sysctl knob
> > - exthdrs_core.c is always built-in even when CONFIG_IPV6=n,
> >   where struct net has no .ipv6 member, so the read site would
> >   need an ifdef'd fallback to a constant anyway
> >
> > Therefore, just use a constant (IP6_MAX_EXT_HDRS_CNT). All four
> > extension header walking functions are now bound by this limit.
> >
> > Note that the check in ip6_protocol_deliver_rcu() happens right
> > before the goto resubmit, such that we don't have to have a test
> > for ipv6_ext_hdr() in the fast-path.
> >
> > There's an ongoing IETF draft-iurman-6man-eh-occurrences to enforce
> > IPv6 extension headers ordering and occurrence. The latter also
> > discusses security implications. As per RFC8200 section 4.1, the
> > occurrence rules for extension headers provide a practical upper
> > bound which is 8. In order to be conservative, let's define
> > IP6_MAX_EXT_HDRS_CNT as 12 to leave enough room for quirky setups.
> > In the unlikely event that this is still not enough, then we might
> > need to reconsider a sysctl.
> >
> > Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
>
> Reviewed-by: Ido Schimmel <idosch@nvidia.com>

Reviewed-by: Eric Dumazet <edumazet@google.com>

Thanks Daniel!

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net v5] ipv6: Implement limits on extension header parsing
  2026-04-29 15:46 [PATCH net v5] ipv6: Implement limits on extension header parsing Daniel Borkmann
  2026-04-30 12:40 ` Ido Schimmel
@ 2026-04-30 16:53 ` Justin Iurman
  2026-05-01  0:40 ` patchwork-bot+netdevbpf
  2 siblings, 0 replies; 5+ messages in thread
From: Justin Iurman @ 2026-04-30 16:53 UTC (permalink / raw)
  To: Daniel Borkmann, kuba
  Cc: edumazet, dsahern, tom, willemdebruijn.kernel, idosch, pabeni,
	netdev

On 4/29/26 17:46, Daniel Borkmann wrote:
> ipv6_{skip_exthdr,find_hdr}() and ip6_{tnl_parse_tlv_enc_lim,
> protocol_deliver_rcu}() iterate over IPv6 extension headers until they
> find a non-extension-header protocol or run out of packet data. The
> loops have no iteration counter, relying solely on the packet length
> to bound them. For a crafted packet with 8-byte extension headers
> filling a 64KB jumbogram, this means a worst case of up to ~8k
> iterations with a skb_header_pointer call each. ipv6_skip_exthdr(),
> for example, is used where it parses the inner quoted packet inside
> an incoming ICMPv6 error:
> 
>    - icmpv6_rcv
>      - checksum validation
>      - case ICMPV6_DEST_UNREACH
>        - icmpv6_notify
>          - pskb_may_pull()       <- pull inner IPv6 header
>          - ipv6_skip_exthdr()    <- iterates here
>          - pskb_may_pull()
>          - ipprot->err_handler() <- sk lookup
> 
> The per-iteration cost of ipv6_skip_exthdr itself is generally
> light, but skb_header_pointer becomes more costly on reassembled
> packets: the first ~1232 bytes of the inner packet are in the skb's
> linear area, but the remaining ~63KB are in the frag_list where
> skb_copy_bits is needed to read data.
> 
> Initially, the idea was to add a configurable limit via a new
> sysctl knob with default 8, in line with knobs from commit
> 47d3d7ac656a ("ipv6: Implement limits on Hop-by-Hop and Destination
> options"), but two reasons eventually argued against it:
> 
> - It adds to UAPI that needs to be maintained forever, and
>    upcoming work is restricting extension header ordering anyway,
>    leaving little reason for another sysctl knob
> - exthdrs_core.c is always built-in even when CONFIG_IPV6=n,
>    where struct net has no .ipv6 member, so the read site would
>    need an ifdef'd fallback to a constant anyway
> 
> Therefore, just use a constant (IP6_MAX_EXT_HDRS_CNT). All four
> extension header walking functions are now bound by this limit.
> 
> Note that the check in ip6_protocol_deliver_rcu() happens right
> before the goto resubmit, such that we don't have to have a test
> for ipv6_ext_hdr() in the fast-path.
> 
> There's an ongoing IETF draft-iurman-6man-eh-occurrences to enforce
> IPv6 extension headers ordering and occurrence. The latter also
> discusses security implications. As per RFC8200 section 4.1, the
> occurrence rules for extension headers provide a practical upper
> bound which is 8. In order to be conservative, let's define
> IP6_MAX_EXT_HDRS_CNT as 12 to leave enough room for quirky setups.
> In the unlikely event that this is still not enough, then we might
> need to reconsider a sysctl.
> 
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

Reviewed-by: Justin Iurman <justin.iurman@gmail.com>

Thanks for your patience, Daniel!

Cheers,
Justin

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net v5] ipv6: Implement limits on extension header parsing
  2026-04-29 15:46 [PATCH net v5] ipv6: Implement limits on extension header parsing Daniel Borkmann
  2026-04-30 12:40 ` Ido Schimmel
  2026-04-30 16:53 ` Justin Iurman
@ 2026-05-01  0:40 ` patchwork-bot+netdevbpf
  2 siblings, 0 replies; 5+ messages in thread
From: patchwork-bot+netdevbpf @ 2026-05-01  0:40 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: kuba, edumazet, dsahern, tom, willemdebruijn.kernel, idosch,
	pabeni, justin.iurman, netdev

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Wed, 29 Apr 2026 17:46:48 +0200 you wrote:
> ipv6_{skip_exthdr,find_hdr}() and ip6_{tnl_parse_tlv_enc_lim,
> protocol_deliver_rcu}() iterate over IPv6 extension headers until they
> find a non-extension-header protocol or run out of packet data. The
> loops have no iteration counter, relying solely on the packet length
> to bound them. For a crafted packet with 8-byte extension headers
> filling a 64KB jumbogram, this means a worst case of up to ~8k
> iterations with a skb_header_pointer call each. ipv6_skip_exthdr(),
> for example, is used where it parses the inner quoted packet inside
> an incoming ICMPv6 error:
> 
> [...]

Here is the summary with links:
  - [net,v5] ipv6: Implement limits on extension header parsing
    https://git.kernel.org/netdev/net/c/3744b0964d52

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2026-05-01  0:40 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-29 15:46 [PATCH net v5] ipv6: Implement limits on extension header parsing Daniel Borkmann
2026-04-30 12:40 ` Ido Schimmel
2026-04-30 13:25   ` Eric Dumazet
2026-04-30 16:53 ` Justin Iurman
2026-05-01  0:40 ` patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox