public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
From: Justin Iurman <justin.iurman@gmail.com>
To: Daniel Borkmann <daniel@iogearbox.net>, kuba@kernel.org
Cc: edumazet@google.com, dsahern@kernel.org, tom@herbertland.com,
	willemdebruijn.kernel@gmail.com, idosch@nvidia.com,
	pabeni@redhat.com, netdev@vger.kernel.org
Subject: Re: [PATCH net v2] ipv6: Implement limits on extension header parsing
Date: Sat, 25 Apr 2026 12:19:18 +0200	[thread overview]
Message-ID: <b50025ba-c59d-4575-a790-fdaf0a48961d@gmail.com> (raw)
In-Reply-To: <20260425075521.736328-1-daniel@iogearbox.net>

On 4/25/26 09:55, Daniel Borkmann wrote:
> ipv6_{skip_exthdr,find_hdr}() and ip6_{tnl_parse_tlv_enc_lim,
> protocol_deliver_rcu}() iterate over IPv6 extension headers until they
> find a non-extension-header protocol or run out of packet data. The
> loops have no iteration counter, relying solely on the packet length
> to bound them. For a crafted packet with 8-byte extension headers
> filling a 64KB jumbogram, this means a worst case of up to ~8k
> iterations with a skb_header_pointer call each. ipv6_skip_exthdr(),
> for example, is used where it parses the inner quoted packet inside
> an incoming ICMPv6 error:
> 
>    - icmpv6_rcv
>      - checksum validation
>      - case ICMPV6_DEST_UNREACH
>        - icmpv6_notify
>          - pskb_may_pull()       <- pull inner IPv6 header
>          - ipv6_skip_exthdr()    <- iterates here
>          - pskb_may_pull()
>          - ipprot->err_handler() <- sk lookup
> 
> The per-iteration cost of ipv6_skip_exthdr itself is generally
> light, but skb_header_pointer becomes more costly on reassembled
> packets: the first ~1232 bytes of the inner packet are in the skb's
> linear area, but the remaining ~63KB are in the frag_list where
> skb_copy_bits is needed to read data.
> 
> Add a configurable limit via a new sysctl net.ipv6.max_ext_hdrs_number
> (default 8, minimum 1). All four extension header walking functions
> are bound by this limit. The sysctl is in line with commit 47d3d7ac656a
> ("ipv6: Implement limits on Hop-by-Hop and Destination options").
> As documented, init_net is used to derive max_ext_hdrs_number to
> be consistent given a net cannot always reliably be retrieved.
> 
> Note that the check in ip6_protocol_deliver_rcu() happens right
> before the goto resubmit, such that we don't have to have a test
> for ipv6_ext_hdr() in the fast-path.
> 
> There's an ongoing IETF draft-iurman-6man-eh-occurrences to enforce
> IPv6 extension headers ordering and occurrence. The latter also
> discusses security implications. As per RFC8200 section 4.1, the
> occurrence rules for extension headers provide a practical upper
> bound, thus 8 was used as the default.
> 
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> ---
>   v1->v2:
>     - Set the default to 8 (Justin)
>     - Update IETF references (Justin)
>     - Add core path coverage as well (Justin)
> 
>   Documentation/networking/ip-sysctl.rst |  7 +++++++
>   include/net/dropreason-core.h          |  6 ++++++
>   include/net/ipv6.h                     |  2 ++
>   include/net/netns/ipv6.h               |  1 +
>   net/ipv6/af_inet6.c                    |  1 +
>   net/ipv6/exthdrs_core.c                | 11 +++++++++++
>   net/ipv6/ip6_input.c                   |  6 ++++++
>   net/ipv6/ip6_tunnel.c                  |  5 +++++
>   net/ipv6/sysctl_net_ipv6.c             |  8 ++++++++
>   9 files changed, 47 insertions(+)
> 

[snip]

> diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
> index 967b07aeb683..a5bbbc16e8d7 100644
> --- a/net/ipv6/ip6_input.c
> +++ b/net/ipv6/ip6_input.c
> @@ -403,8 +403,10 @@ INDIRECT_CALLABLE_DECLARE(int tcp_v6_rcv(struct sk_buff *));
>   void ip6_protocol_deliver_rcu(struct net *net, struct sk_buff *skb, int nexthdr,
>   			      bool have_final)
>   {
> +	int exthdr_max = READ_ONCE(init_net.ipv6.sysctl.max_ext_hdrs_cnt);
>   	const struct inet6_protocol *ipprot;
>   	struct inet6_dev *idev;
> +	int exthdr_cnt = 0;
>   	unsigned int nhoff;
>   	SKB_DR(reason);
>   	bool raw;
> @@ -487,6 +489,10 @@ void ip6_protocol_deliver_rcu(struct net *net, struct sk_buff *skb, int nexthdr,
>   				nexthdr = ret;
>   				goto resubmit_final;
>   			} else {
> +				if (unlikely(exthdr_cnt++ >= exthdr_max)) {
> +					SKB_DR_SET(reason, IPV6_TOO_MANY_EXTHDRS);
> +					goto discard;
> +				}
>   				goto resubmit;
>   			}
>   		} else if (ret == 0) {

The hop-by-hop options header (if present) is not taken into account 
based on the above. However, the max number of extension headers 
(implicitly 7***, as per RFC 8200 Section 4.1) must include it. I 
suggest adding this at the beginning of ip6_protocol_deliver_rcu():

struct inet6_skb_parm *opt = IP6CB(skb);

if (opt->flags & IP6SKB_HOPBYHOP)
	exthdr_cnt++;

*** FYI, rounding to 8 is fine for this fix

> diff --git a/net/ipv6/sysctl_net_ipv6.c b/net/ipv6/sysctl_net_ipv6.c
> index d2cd33e2698d..93f865545a7c 100644
> --- a/net/ipv6/sysctl_net_ipv6.c
> +++ b/net/ipv6/sysctl_net_ipv6.c
> @@ -135,6 +135,14 @@ static struct ctl_table ipv6_table_template[] = {
>   		.extra1		= SYSCTL_ZERO,
>   		.extra2		= &flowlabel_reflect_max,
>   	},
> +	{
> +		.procname	= "max_ext_hdrs_number",
> +		.data		= &init_net.ipv6.sysctl.max_ext_hdrs_cnt,
> +		.maxlen		= sizeof(int),
> +		.mode		= 0644,
> +		.proc_handler	= proc_dointvec_minmax,
> +		.extra1		= SYSCTL_ONE,
> +	},
>   	{
>   		.procname	= "max_dst_opts_number",
>   		.data		= &init_net.ipv6.sysctl.max_dst_opts_cnt,

I've given it a lot of thought. I came to the conclusion that we should 
use a hard-coded value here as well (just like we did for 076b8cad77aa, 
with the same logic), not a sysctl. IMO, the main reason is that it 
provides as is a suitable security fix to be backported, i.e., the max 
value is the max number of EHs allowed by RFC 8200, Section 4.1. Also, 
we remain consistent with draft-iurman-6man-eh-occurrences (I think Tom 
is about to send a revision of the series soon for net-next). What this 
series does is not only enforcing ordering, but also verifying the 
specific number of occurrences for each type of Extension Header. Which 
is totally compatible with what this patch does, i.e., limiting the 
total number of Extension Headers (regardless of their types) to 8. I 
guess what I'm trying to say is that it seems like a good 
plan/compromise and that the aforementioned series would build perfectly 
on top of this fix.

  reply	other threads:[~2026-04-25 10:19 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-25  7:55 [PATCH net v2] ipv6: Implement limits on extension header parsing Daniel Borkmann
2026-04-25 10:19 ` Justin Iurman [this message]
2026-04-26 10:38   ` Daniel Borkmann
2026-04-26 10:56     ` saeed bishara
2026-04-26 13:17     ` Ido Schimmel
2026-04-26 15:47       ` Justin Iurman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b50025ba-c59d-4575-a790-fdaf0a48961d@gmail.com \
    --to=justin.iurman@gmail.com \
    --cc=daniel@iogearbox.net \
    --cc=dsahern@kernel.org \
    --cc=edumazet@google.com \
    --cc=idosch@nvidia.com \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=tom@herbertland.com \
    --cc=willemdebruijn.kernel@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox