From: Simon Horman <horms@kernel.org>
To: angelayucode@gmail.com
Cc: Simon Horman <horms@kernel.org>,
dsahern@kernel.or, blair.steven@alliedtelesis.co.nz,
carl.smith@alliedtelesis.co.nz, linux-kernel@vger.kernel.org,
angela.yu@alliedtelesis.co.nz, davem@davemloft.net,
dsahern@kernel.org, edumazet@google.com, kuba@kernel.org,
pabeni@redhat.com, netdev@vger.kernel.org
Subject: Re: [PATCH 1/1] [PATCH] l2tp: fix IPv6 xmit frag failure for oversized non-GSO packets
Date: Fri, 12 Jun 2026 17:42:14 +0100 [thread overview]
Message-ID: <20260612164213.694784-2-horms@kernel.org> (raw)
In-Reply-To: <20260610223557.2765487-1-angela.yu@alliedtelesis.co.nz>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
l2tp: fix IPv6 xmit frag failure for oversized non-GSO packets
This patch sets skb->ignore_df from ip6_sk_ignore_df(sk) inside ip6_xmit()
so that oversized non-GSO L2TPv3-over-IPv6 outer packets follow the
fragmentation path instead of being dropped with EMSGSIZE.
> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index c14adcdd4396..f5370554cfb5 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -359,6 +359,7 @@ int ip6_xmit(const struct sock *sk, struct sk_buff *skb, struct flowi6 *fl6,
> skb->protocol = htons(ETH_P_IPV6);
> skb->priority = priority;
> skb->mark = mark;
> + skb->ignore_df = ip6_sk_ignore_df(sk);
[High]
Can this unconditional assignment clobber an skb->ignore_df that the
caller deliberately set before invoking ip6_xmit()?
For example, sctp_v6_xmit() in net/sctp/ipv6.c does:
if (!(t->param_flags & SPP_PMTUD_ENABLE))
skb->ignore_df = 1;
...
res = ip6_xmit(sk, skb, fl6, sk->sk_mark,
rcu_dereference(np->opt),
tclass, READ_ONCE(sk->sk_priority));
If the SCTP socket has pmtudisc set to IPV6_PMTUDISC_DO, _PROBE, or
_INTERFACE via setsockopt(IPV6_MTU_DISCOVER), ip6_sk_ignore_df() returns
false:
static inline bool ip6_sk_ignore_df(const struct sock *sk)
{
u8 pmtudisc = READ_ONCE(inet6_sk(sk)->pmtudisc);
return pmtudisc < IPV6_PMTUDISC_DO ||
pmtudisc == IPV6_PMTUDISC_OMIT;
}
In that combination, won't the new write reset ignore_df back to 0 and
defeat the per-transport SPP_PMTUD_ENABLE-off setting, sending the
packet down the EMSGSIZE / IPSTATS_MIB_FRAGFAILS branch that the patch
is trying to avoid?
The same concern applies to L2TP itself. l2tp_xmit_queue() in
net/l2tp/l2tp_core.c does:
skb->ignore_df = 1;
skb_dst_drop(skb);
if (l2tp_sk_is_v6(tunnel->sock))
err = inet6_csk_xmit(tunnel->sock, skb, NULL);
inet6_csk_xmit() does not touch skb->ignore_df before reaching
ip6_xmit(). If the tunnel socket is configured with
IPV6_PMTUDISC_DO, doesn't the new line in ip6_xmit() also overwrite the
ignore_df=1 that l2tp_xmit_queue() just set?
Other ip6_xmit() callers (TCP via inet6_csk_xmit/tcp_v6_send_synack/
tcp_v6_send_response, XFRM NAT keepalive via nat_keepalive_send_ipv6)
inherit this behaviour change too. Would an OR-merge such as
skb->ignore_df = skb->ignore_df || ip6_sk_ignore_df(sk);
or fixing this in the offending L2TP path be more appropriate, so the
contract of ip6_xmit() with its other callers is preserved?
[Medium]
The commit message says:
For oversized non-GSO packets skb->ignore_df is not set, causing the
packet to fail the transmit checks and hit the
EMSGSIZE/IPSTATS_MIB_FRAGFAILS path instead of being fragments.
Is that accurate for the L2TPv3 data path? l2tp_xmit_queue() already
unconditionally does:
skb->ignore_df = 1;
...
err = inet6_csk_xmit(tunnel->sock, skb, NULL);
and inet6_csk_xmit() does not clear it before ip6_xmit(). With the
pre-patch test
if (likely((skb->len <= mtu) || skb->ignore_df || skb_is_gso(skb)))
ignore_df=1 should already short-circuit the MTU check on the data
path. Is the failure actually coming from a different path (for
instance an L2TPIP control-socket send that bypasses
l2tp_xmit_queue())? Could the changelog be updated to identify the
exact path that was hitting the EMSGSIZE branch?
> mtu = dst6_mtu(dst);
> if (likely((skb->len <= mtu) || skb->ignore_df || skb_is_gso(skb))) {
[Low]
After this change, ip6_xmit() and __ip_queue_xmit() have asymmetric
semantics for skb->ignore_df. The IPv4 helper only reads it:
if (ip_dont_fragment(sk, &rt->dst) && !skb->ignore_df)
while ip6_xmit() now writes it from the socket's pmtudisc, overriding
callers. Should the two paths be aligned, or should this divergence be
documented near ip6_xmit()?
Also, two small typos in the commit message:
the packet length does not excent the MTU
^^^^^^ exceed?
instead of being fragments.
^^^^^^^^^ fragmented?
transmited as intended.
^^^^^^^^^^ transmitted?
next prev parent reply other threads:[~2026-06-12 16:42 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-10 22:35 [PATCH 1/1] [PATCH] l2tp: fix IPv6 xmit frag failure for oversized non-GSO packets Angela Yu
2026-06-12 16:42 ` Simon Horman [this message]
2026-06-12 16:50 ` Simon Horman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260612164213.694784-2-horms@kernel.org \
--to=horms@kernel.org \
--cc=angela.yu@alliedtelesis.co.nz \
--cc=angelayucode@gmail.com \
--cc=blair.steven@alliedtelesis.co.nz \
--cc=carl.smith@alliedtelesis.co.nz \
--cc=davem@davemloft.net \
--cc=dsahern@kernel.or \
--cc=dsahern@kernel.org \
--cc=edumazet@google.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox