public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
From: Justin Iurman <justin.iurman@gmail.com>
To: Daniel Borkmann <daniel@iogearbox.net>, kuba@kernel.org
Cc: edumazet@google.com, dsahern@kernel.org, tom@herbertland.com,
	willemdebruijn.kernel@gmail.com, idosch@nvidia.com,
	pabeni@redhat.com, netdev@vger.kernel.org
Subject: Re: [PATCH net v2] ipv6: Apply max_dst_opts_cnt to ip6_tnl_parse_tlv_enc_lim
Date: Sun, 19 Apr 2026 00:37:35 +0200	[thread overview]
Message-ID: <8fdf517b-6217-4df6-8adf-0c79ce8d3be8@gmail.com> (raw)
In-Reply-To: <d8d60970-6f73-4b9d-8808-76f4f8023100@iogearbox.net>

On 4/19/26 00:19, Daniel Borkmann wrote:
> On 4/18/26 2:40 PM, Justin Iurman wrote:
>> On 4/18/26 14:15, Daniel Borkmann wrote:
>>> Commit 47d3d7ac656a ("ipv6: Implement limits on Hop-by-Hop and
>>> Destination options") added net.ipv6.max_{hbh,dst}_opts_{cnt,len}
>>> and applied them in ip6_parse_tlv(), the generic TLV walker
>>> invoked from ipv6_destopt_rcv() and ipv6_parse_hopopts().
>>>
>>> ip6_tnl_parse_tlv_enc_lim() does not go through ip6_parse_tlv();
>>> it has its own hand-rolled TLV scanner inside its NEXTHDR_DEST
>>> branch which looks for IPV6_TLV_TNL_ENCAP_LIMIT. That inner
>>> loop is bounded only by optlen, which can be up to 2048 bytes.
>>> Stuffing the Destination Options header with 2046 Pad1 (type=0)
>>> entries advances the scanner a single byte at a time, yielding
>>> ~2000 TLV iterations per extension header.
>>>
>>> Reuse max_dst_opts_cnt to bound the TLV iterations, matching
>>> the semantics from 47d3d7ac656a.
>>>
>>> Fixes: 47d3d7ac656a ("ipv6: Implement limits on Hop-by-Hop and 
>>> Destination options")
>>> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
>>> ---
>>>   v1->v2:
>>>     - Remove unlikely (Justin)
>>>     - Use abs() given max_dst_opts_cnt's negative meaning (Justin)
>>>
>>>   net/ipv6/ip6_tunnel.c | 5 +++++
>>>   1 file changed, 5 insertions(+)
>>>
>>> diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
>>> index 907c6a2af331..0f50b7fcb24e 100644
>>> --- a/net/ipv6/ip6_tunnel.c
>>> +++ b/net/ipv6/ip6_tunnel.c
>>> @@ -430,11 +430,16 @@ __u16 ip6_tnl_parse_tlv_enc_lim(struct sk_buff 
>>> *skb, __u8 *raw)
>>>                   break;
>>>           }
>>>           if (nexthdr == NEXTHDR_DEST) {
>>> +            int tlv_max = 
>>> abs(READ_ONCE(init_net.ipv6.sysctl.max_dst_opts_cnt));
>>> +            int tlv_cnt = 0;
>>>               u16 i = 2;
>>>               while (1) {
>>>                   struct ipv6_tlv_tnl_enc_lim *tel;
>>> +                if (tlv_cnt++ >= tlv_max)
>>> +                    break;
>>> +
>>>                   /* No more room for encapsulation limit */
>>>                   if (i + sizeof(*tel) > optlen)
>>>                       break;
>>
>> Thanks for v2, Daniel.
>>
>> I'm still wondering: should we align the above parsing behavior with 
>> the one in ip6_parse_tlv() to keep things consistent? That is: don't 
>> increment tlv_cnt for Pad1/PadN, make sure we don't exceed 8 bytes per 
>> padding (consecutive Pad1's, or a PadN), and we could also check that 
>> a PadN payload is only made of zeroes. Open question...
> 
> Hm, so it would be sth like the below on top of this one, I'd wish we 
> wouldn't
> have to open code another ip6_parse_tlv().
> 
> Have you seen such cases with the encap limit in the wild (vs encap 
> limit as
> first tlv) where a stack places leading Pad1/PadN before encap limit 
> which the
> v2 patch wouldn't catch?

Nope. But if it happens, users would be confused as max_dst_opts_cnt 
would not have the same meaning in two different code paths. OTOH, I 
agree that such situation would look suspicious. I guess it's fine to 
keep your patch as is and to not over-complicate things unnecessarily.

>   net/ipv6/ip6_tunnel.c | 42 +++++++++++++++++++++++++++++++-----------
>   1 file changed, 31 insertions(+), 11 deletions(-)
> 
> diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
> index 0f50b7fcb24e..1fa42a1cd97c 100644
> --- a/net/ipv6/ip6_tunnel.c
> +++ b/net/ipv6/ip6_tunnel.c
> @@ -432,28 +432,48 @@ __u16 ip6_tnl_parse_tlv_enc_lim(struct sk_buff 
> *skb, __u8 *raw)
>           if (nexthdr == NEXTHDR_DEST) {
>               int tlv_max = 
> abs(READ_ONCE(init_net.ipv6.sysctl.max_dst_opts_cnt));
>               int tlv_cnt = 0;
> +            int padlen = 0;
>               u16 i = 2;
> 
> -            while (1) {
> -                struct ipv6_tlv_tnl_enc_lim *tel;
> +            while (i < optlen) {
> +                struct ipv6_tlv_tnl_enc_lim *tel =
> +                    (struct ipv6_tlv_tnl_enc_lim *)(skb->data + off + i);
> +                int tel_len;
> 
> -                if (tlv_cnt++ >= tlv_max)
> +                if (tel->type == IPV6_TLV_PAD1) {
> +                    if (++padlen > 7)
> +                        break;
> +                    i++;
> +                    continue;
> +                }
> +
> +                if (i + 2 > optlen)
> +                    break;
> +                tel_len = tel->length + 2;
> +                if (i + tel_len > optlen)
>                       break;
> 
> -                /* No more room for encapsulation limit */
> -                if (i + sizeof(*tel) > optlen)
> +                if (tel->type == IPV6_TLV_PADN) {
> +                    padlen += tel_len;
> +                    if (padlen > 7)
> +                        break;
> +                    if (memchr_inv((u8 *)tel + 2, 0,
> +                               tel->length))
> +                        break;
> +                    i += tel_len;
> +                    continue;
> +                }
> +                padlen = 0;
> +
> +                if (tlv_cnt++ >= tlv_max)
>                       break;
> 
> -                tel = (struct ipv6_tlv_tnl_enc_lim *)(skb->data + off + 
> i);
>                   /* return index of option if found and valid */
>                   if (tel->type == IPV6_TLV_TNL_ENCAP_LIMIT &&
>                       tel->length == 1)
>                       return i + off - nhoff;
> -                /* else jump to next option */
> -                if (tel->type)
> -                    i += tel->length + 2;
> -                else
> -                    i++;
> +
> +                i += tel_len;
>               }
>           }
>           nexthdr = hdr->nexthdr;


  reply	other threads:[~2026-04-18 22:37 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-18 12:15 [PATCH net v2] ipv6: Apply max_dst_opts_cnt to ip6_tnl_parse_tlv_enc_lim Daniel Borkmann
2026-04-18 12:40 ` Justin Iurman
2026-04-18 22:19   ` Daniel Borkmann
2026-04-18 22:37     ` Justin Iurman [this message]
2026-04-19 14:31       ` Ido Schimmel
2026-04-20 18:55         ` Justin Iurman
2026-04-21  7:33           ` Daniel Borkmann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8fdf517b-6217-4df6-8adf-0c79ce8d3be8@gmail.com \
    --to=justin.iurman@gmail.com \
    --cc=daniel@iogearbox.net \
    --cc=dsahern@kernel.org \
    --cc=edumazet@google.com \
    --cc=idosch@nvidia.com \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=tom@herbertland.com \
    --cc=willemdebruijn.kernel@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox