netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Steffen Klassert <steffen.klassert@secunet.com>
To: David Miller <davem@davemloft.net>
Cc: eric.dumazet@gmail.com, herbert@gondor.hengli.com.au,
	netdev@vger.kernel.org
Subject: Re: [PATCH 2/3] ipv4: Fix packet size calculation for IPsec packets in __ip_append_data
Date: Wed, 22 Jun 2011 13:02:19 +0200	[thread overview]
Message-ID: <20110622110219.GF6489@secunet.com> (raw)
In-Reply-To: <20110609.144704.1705676110990397845.davem@davemloft.net>

On Thu, Jun 09, 2011 at 02:47:04PM -0700, David Miller wrote:
> From: Steffen Klassert <steffen.klassert@secunet.com>
> Date: Wed, 8 Jun 2011 07:30:20 +0200
> 
> > In between I can confirm that we get the slowpath problem back with my
> > patch, so we still have a bug somewhere. Reverting your commit would
> > be just a band aid. I think it is better to find the bug and do a real
> > fix instead. Unfortunatly I fear I'm not able to track it down before
> > my vacation that starts tomorrow. I'll continue to work at it once I'm
> > back...
> 
> Thanks Steffen, looking forward to this.

In between I found the problem. In ip_setup_cork() we take the mtu on the
base of dst_mtu(rt->dst.path) and assign it to cork->fragsize which is
used as the mtu in __ip_append_data(). The path dst_entry is a routing
dst_entry that does not take the IPsec header and trailer overhead into
account. So if we build an IPsec packet based on this mtu it might be to
big after the IPsec transformations are applied. If we take the actual
IPsec mtu from dst_mtu(&rt->dst) and adapt the exthdr handling to this,
it works as expected. So I'll send two patches, one that reverts Eric's
patch and one that fixes the slowpath issue.

While reading through the code of __ip_append_data() I noticed that we
might use ip_ufo_append_data() for packets that will be IPsec transformed
later, is this ok? I don't know how ufo handling works, but I would guess
that it expects an udp header and not an IPsec header as the packets
transport header.

I found another funny thing when I tested my patches:

I use an IPsec tunnel with tunnel endpoints 192.168.1.1 and 192.168.1.2

Then I do at 192.168.1.2

ping -c20 -M do -s 1500 192.168.1.1

PING 192.168.1.1 (192.168.1.1) 1500(1528) bytes of data.
>From 192.168.1.1 icmp_seq=1 Frag needed and DF set (mtu = 1438)
>From 192.168.1.1 icmp_seq=1 Frag needed and DF set (mtu = 1438)
>From 192.168.1.1 icmp_seq=1 Frag needed and DF set (mtu = 1374)
>From 192.168.1.1 icmp_seq=1 Frag needed and DF set (mtu = 1310)
>From 192.168.1.1 icmp_seq=1 Frag needed and DF set (mtu = 1246)
>From 192.168.1.1 icmp_seq=1 Frag needed and DF set (mtu = 1182)
>From 192.168.1.1 icmp_seq=1 Frag needed and DF set (mtu = 1118)
>From 192.168.1.1 icmp_seq=1 Frag needed and DF set (mtu = 1054)
>From 192.168.1.1 icmp_seq=1 Frag needed and DF set (mtu = 990)
>From 192.168.1.1 icmp_seq=1 Frag needed and DF set (mtu = 926)
>From 192.168.1.1 icmp_seq=1 Frag needed and DF set (mtu = 862)
>From 192.168.1.1 icmp_seq=1 Frag needed and DF set (mtu = 798)
>From 192.168.1.1 icmp_seq=1 Frag needed and DF set (mtu = 734)
>From 192.168.1.1 icmp_seq=1 Frag needed and DF set (mtu = 670)
>From 192.168.1.1 icmp_seq=1 Frag needed and DF set (mtu = 606)
>From 192.168.1.1 icmp_seq=1 Frag needed and DF set (mtu = 552)
>From 192.168.1.1 icmp_seq=1 Frag needed and DF set (mtu = 494)
>From 192.168.1.1 icmp_seq=1 Frag needed and DF set (mtu = 494)
>From 192.168.1.1 icmp_seq=1 Frag needed and DF set (mtu = 494)
>From 192.168.1.1 icmp_seq=1 Frag needed and DF set (mtu = 494)

--- 192.168.1.1 ping statistics ---
0 packets transmitted, 0 received, +20 errors

These packets are locally generated and not from 192.168.1.1 as they
claim to be. I think the problem is the following:

The IPsec mtu is 1438 here, so the first packet is too big.
xfrm4_tunnel_check_size() notices this and sends a ICMP_FRAG_NEEDED
packet that announces a mtu of 1438 to the original sender of the ping
packet. Unfortunately the sender is a local address, it's the IPsec
tunnel entry point. So we update the mtu for this connection to 1438.
Now, with the next packet xfrm_bundle_ok() notices that the path mtu has
changed, so it subtracts the IPsec overhead from the mtu a second time
and we end up with a mtu of 1374. This game goes until we reach a minimal
mtu of 494.

Unfortunately I don't know how to fix this. Any ideas?


  reply	other threads:[~2011-06-22 11:02 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-06-06  6:46 [PATCH 1/3] xfrm: Fix off by one in the replay advance functions Steffen Klassert
2011-06-06  6:48 ` [PATCH 2/3] ipv4: Fix packet size calculation for IPsec packets in __ip_append_data Steffen Klassert
2011-06-06  7:38   ` Eric Dumazet
2011-06-06  8:52     ` Steffen Klassert
2011-06-07  5:06       ` Eric Dumazet
2011-06-08  5:30         ` Steffen Klassert
2011-06-09 21:47           ` David Miller
2011-06-22 11:02             ` Steffen Klassert [this message]
2011-06-28  3:39               ` David Miller
2011-06-30  9:06                 ` Steffen Klassert
2011-06-06  6:48 ` [PATCH 3/3] ipv4: Fix packet size calculation for raw " Steffen Klassert
2011-06-09 21:50   ` David Miller
2011-06-08  4:15 ` [PATCH 1/3] xfrm: Fix off by one in the replay advance functions David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110622110219.GF6489@secunet.com \
    --to=steffen.klassert@secunet.com \
    --cc=davem@davemloft.net \
    --cc=eric.dumazet@gmail.com \
    --cc=herbert@gondor.hengli.com.au \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).