* problem forwarding IP fragments with DF bit set (caused by ipv4: fix path MTU discovery with connection tracking)
@ 2014-04-28 16:37 Maxime Bizon
2014-04-28 17:59 ` Maxime Bizon
2014-04-29 14:33 ` Patrick McHardy
0 siblings, 2 replies; 10+ messages in thread
From: Maxime Bizon @ 2014-04-28 16:37 UTC (permalink / raw)
To: Patrick McHardy, Eric Dumazet; +Cc: davem, netdev
Hello Patrick & Eric,
After upgrading a router with a kernel that has patch 5f2d04f1f9 (ipv4:
fix path MTU discovery with connection tracking), some packets are not
forwarded anymore.
(note: kernel is 3.11.10, and conntrack is enabled)
Offending packets are IP fragments with DF bit set, MTU is the same on
both interfaces involved in forwarding. All received fragments are
(obviously) below MTU. The resulting packet after re-assembly is however
above MTU.
conntrack causes the packets to be re-assembled, but since the resulting
skb now has IP_DF set, it fails the (DF + MTU) test in ip_forward.c and
causes ICMP frag_needed to be sent.
Without the patch, the packet was (re-)fragmented in the output path
(but as the patch says, breaking PMTUD because original fragment size is
not considered).
--
Maxime
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: problem forwarding IP fragments with DF bit set (caused by ipv4: fix path MTU discovery with connection tracking)
2014-04-28 16:37 problem forwarding IP fragments with DF bit set (caused by ipv4: fix path MTU discovery with connection tracking) Maxime Bizon
@ 2014-04-28 17:59 ` Maxime Bizon
2014-04-29 14:35 ` Patrick McHardy
2014-04-29 14:33 ` Patrick McHardy
1 sibling, 1 reply; 10+ messages in thread
From: Maxime Bizon @ 2014-04-28 17:59 UTC (permalink / raw)
To: Patrick McHardy; +Cc: Eric Dumazet, davem, netdev
On Mon, 2014-04-28 at 18:37 +0200, Maxime Bizon wrote:
>
> conntrack causes the packets to be re-assembled, but since the
> resulting skb now has IP_DF set, it fails the (DF + MTU) test in
> ip_forward.c and causes ICMP frag_needed to be sent.
hum isn't it just a matter of doing this ?
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -493,6 +493,8 @@ found:
skb->len + ihl > qp->q.max_size)
qp->q.max_size = skb->len + ihl;
+ skb->local_df = 1;
+
if (qp->q.last_in == (INET_FRAG_FIRST_IN | INET_FRAG_LAST_IN) &&
qp->q.meat == qp->q.len) {
unsigned long orefdst = skb->_skb_refdst;
--
Maxime
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: problem forwarding IP fragments with DF bit set (caused by ipv4: fix path MTU discovery with connection tracking)
2014-04-28 17:59 ` Maxime Bizon
@ 2014-04-29 14:35 ` Patrick McHardy
0 siblings, 0 replies; 10+ messages in thread
From: Patrick McHardy @ 2014-04-29 14:35 UTC (permalink / raw)
To: Maxime Bizon; +Cc: Eric Dumazet, davem, netdev
On Mon, Apr 28, 2014 at 07:59:40PM +0200, Maxime Bizon wrote:
>
> On Mon, 2014-04-28 at 18:37 +0200, Maxime Bizon wrote:
> >
> > conntrack causes the packets to be re-assembled, but since the
> > resulting skb now has IP_DF set, it fails the (DF + MTU) test in
> > ip_forward.c and causes ICMP frag_needed to be sent.
>
> hum isn't it just a matter of doing this ?
But why should we do this? The entire point of the patch was to fix PMTUD.
> --- a/net/ipv4/ip_fragment.c
> +++ b/net/ipv4/ip_fragment.c
> @@ -493,6 +493,8 @@ found:
> skb->len + ihl > qp->q.max_size)
> qp->q.max_size = skb->len + ihl;
>
> + skb->local_df = 1;
> +
> if (qp->q.last_in == (INET_FRAG_FIRST_IN | INET_FRAG_LAST_IN) &&
> qp->q.meat == qp->q.len) {
> unsigned long orefdst = skb->_skb_refdst;
>
>
> --
> Maxime
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: problem forwarding IP fragments with DF bit set (caused by ipv4: fix path MTU discovery with connection tracking)
2014-04-28 16:37 problem forwarding IP fragments with DF bit set (caused by ipv4: fix path MTU discovery with connection tracking) Maxime Bizon
2014-04-28 17:59 ` Maxime Bizon
@ 2014-04-29 14:33 ` Patrick McHardy
2014-04-29 14:42 ` Maxime Bizon
1 sibling, 1 reply; 10+ messages in thread
From: Patrick McHardy @ 2014-04-29 14:33 UTC (permalink / raw)
To: Maxime Bizon; +Cc: Eric Dumazet, davem, netdev
On Mon, Apr 28, 2014 at 06:37:36PM +0200, Maxime Bizon wrote:
>
> Hello Patrick & Eric,
>
> After upgrading a router with a kernel that has patch 5f2d04f1f9 (ipv4:
> fix path MTU discovery with connection tracking), some packets are not
> forwarded anymore.
>
> (note: kernel is 3.11.10, and conntrack is enabled)
>
> Offending packets are IP fragments with DF bit set, MTU is the same on
> both interfaces involved in forwarding. All received fragments are
> (obviously) below MTU. The resulting packet after re-assembly is however
> above MTU.
>
> conntrack causes the packets to be re-assembled, but since the resulting
> skb now has IP_DF set, it fails the (DF + MTU) test in ip_forward.c and
> causes ICMP frag_needed to be sent.
That is the correct behaviour.
> Without the patch, the packet was (re-)fragmented in the output path
> (but as the patch says, breaking PMTUD because original fragment size is
> not considered).
Yes. The sending host should receive the ICMP frag needed and adjust it's
size accordingly.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: problem forwarding IP fragments with DF bit set (caused by ipv4: fix path MTU discovery with connection tracking)
2014-04-29 14:33 ` Patrick McHardy
@ 2014-04-29 14:42 ` Maxime Bizon
2014-04-29 14:45 ` Patrick McHardy
0 siblings, 1 reply; 10+ messages in thread
From: Maxime Bizon @ 2014-04-29 14:42 UTC (permalink / raw)
To: Patrick McHardy; +Cc: Eric Dumazet, davem, netdev
On Tue, 2014-04-29 at 15:33 +0100, Patrick McHardy wrote:
> That is the correct behaviour.
is it ?
say the remote host is sending two 1000 bytes fragments, with DF set to
me and my MTU is 1500, I'm suppose to route these packets to an
interface with the same MTU.
without conntrack both fragments are correctly forwarded
with conntrack, the original packet is re-assembled, then instead of
forwarding it, we send a "frag needed mtu = 1500" to the remote host,
which never sent any packets bigger than 1500 at the beginning.
--
Maxime
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: problem forwarding IP fragments with DF bit set (caused by ipv4: fix path MTU discovery with connection tracking)
2014-04-29 14:42 ` Maxime Bizon
@ 2014-04-29 14:45 ` Patrick McHardy
2014-04-29 15:23 ` Maxime Bizon
0 siblings, 1 reply; 10+ messages in thread
From: Patrick McHardy @ 2014-04-29 14:45 UTC (permalink / raw)
To: Maxime Bizon; +Cc: Eric Dumazet, davem, netdev
On Tue, Apr 29, 2014 at 04:42:15PM +0200, Maxime Bizon wrote:
>
> On Tue, 2014-04-29 at 15:33 +0100, Patrick McHardy wrote:
>
> > That is the correct behaviour.
>
> is it ?
>
> say the remote host is sending two 1000 bytes fragments, with DF set to
> me and my MTU is 1500, I'm suppose to route these packets to an
> interface with the same MTU.
>
> without conntrack both fragments are correctly forwarded
>
> with conntrack, the original packet is re-assembled, then instead of
> forwarding it, we send a "frag needed mtu = 1500" to the remote host,
> which never sent any packets bigger than 1500 at the beginning.
Right, that is not correct of course. We save the original packet size
and should either refragment to that size or send an ICMP frag required
if the original size exceeds the outgoing MTU.
So your patch does look correct, however we should probably only set
local_df in conntrack defrag.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: problem forwarding IP fragments with DF bit set (caused by ipv4: fix path MTU discovery with connection tracking)
2014-04-29 14:45 ` Patrick McHardy
@ 2014-04-29 15:23 ` Maxime Bizon
2014-04-29 15:37 ` Eric Dumazet
0 siblings, 1 reply; 10+ messages in thread
From: Maxime Bizon @ 2014-04-29 15:23 UTC (permalink / raw)
To: Patrick McHardy; +Cc: Eric Dumazet, davem, netdev
On Tue, 2014-04-29 at 15:45 +0100, Patrick McHardy wrote:
> Right, that is not correct of course. We save the original packet size
> and should either refragment to that size or send an ICMP frag required
> if the original size exceeds the outgoing MTU.
yep indeed, if any fragment is bigger than outgoing MTU
> So your patch does look correct, however we should probably only set
> local_df in conntrack defrag.
I thought of putting it in conntrack code first, but I had some doubts.
There are a lot of possible paths that can be taken after reassembly
(ipsec, tunnel, ...), and I did not audit all of them.
Since the patch that caused this regression modified the generic
fragmentation code, it may have caused other silent breakage.
Eric/David, could I get your feeling about this ?
Thanks,
--
Maxime
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: problem forwarding IP fragments with DF bit set (caused by ipv4: fix path MTU discovery with connection tracking)
2014-04-29 15:23 ` Maxime Bizon
@ 2014-04-29 15:37 ` Eric Dumazet
2014-04-29 20:13 ` Florian Westphal
2014-04-30 14:34 ` Maxime Bizon
0 siblings, 2 replies; 10+ messages in thread
From: Eric Dumazet @ 2014-04-29 15:37 UTC (permalink / raw)
To: mbizon
Cc: Patrick McHardy, Eric Dumazet, davem, netdev,
Maciej Żenczykowski
On Tue, 2014-04-29 at 17:23 +0200, Maxime Bizon wrote:
> On Tue, 2014-04-29 at 15:45 +0100, Patrick McHardy wrote:
>
> > Right, that is not correct of course. We save the original packet size
> > and should either refragment to that size or send an ICMP frag required
> > if the original size exceeds the outgoing MTU.
>
> yep indeed, if any fragment is bigger than outgoing MTU
>
> > So your patch does look correct, however we should probably only set
> > local_df in conntrack defrag.
>
> I thought of putting it in conntrack code first, but I had some doubts.
> There are a lot of possible paths that can be taken after reassembly
> (ipsec, tunnel, ...), and I did not audit all of them.
>
> Since the patch that caused this regression modified the generic
> fragmentation code, it may have caused other silent breakage.
>
> Eric/David, could I get your feeling about this ?
Hmm... maybe its finally time to rename local_df to ignore_df, as
Maze suggested some time ago ...
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: problem forwarding IP fragments with DF bit set (caused by ipv4: fix path MTU discovery with connection tracking)
2014-04-29 15:37 ` Eric Dumazet
@ 2014-04-29 20:13 ` Florian Westphal
2014-04-30 14:34 ` Maxime Bizon
1 sibling, 0 replies; 10+ messages in thread
From: Florian Westphal @ 2014-04-29 20:13 UTC (permalink / raw)
To: Eric Dumazet
Cc: mbizon, Patrick McHardy, Eric Dumazet, davem, netdev,
Maciej Żenczykowski
Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Hmm... maybe its finally time to rename local_df to ignore_df, as
> Maze suggested some time ago ...
Yes please.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: problem forwarding IP fragments with DF bit set (caused by ipv4: fix path MTU discovery with connection tracking)
2014-04-29 15:37 ` Eric Dumazet
2014-04-29 20:13 ` Florian Westphal
@ 2014-04-30 14:34 ` Maxime Bizon
1 sibling, 0 replies; 10+ messages in thread
From: Maxime Bizon @ 2014-04-30 14:34 UTC (permalink / raw)
To: Eric Dumazet
Cc: Patrick McHardy, Eric Dumazet, davem, netdev,
Maciej Żenczykowski
On Tue, 2014-04-29 at 08:37 -0700, Eric Dumazet wrote:
> Hmm... maybe its finally time to rename local_df to ignore_df, as
> Maze suggested some time ago ...
It would have indeed helped me to understand the purpose of this flag at
first glance.
About the patch I sent, I'm not well equipped to test mainline kernels
wrt the bug I reported (it does not boot on my embedded device), so I'd
prefer someone else to cook the patch and actually test it.
Thanks,
--
Maxime
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2014-04-30 14:34 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-04-28 16:37 problem forwarding IP fragments with DF bit set (caused by ipv4: fix path MTU discovery with connection tracking) Maxime Bizon
2014-04-28 17:59 ` Maxime Bizon
2014-04-29 14:35 ` Patrick McHardy
2014-04-29 14:33 ` Patrick McHardy
2014-04-29 14:42 ` Maxime Bizon
2014-04-29 14:45 ` Patrick McHardy
2014-04-29 15:23 ` Maxime Bizon
2014-04-29 15:37 ` Eric Dumazet
2014-04-29 20:13 ` Florian Westphal
2014-04-30 14:34 ` Maxime Bizon
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).