From: Steffen Klassert <steffen.klassert@secunet.com>
To: Ben Hutchings <bhutchings@solarflare.com>
Cc: "Yurij M. Plotnikov" <Yurij.Plotnikov@oktetlabs.ru>,
	netdev@vger.kernel.org,
	"Alexandra N. Kossovsky" <Alexandra.Kossovsky@oktetlabs.ru>
Subject: Re: PMTU discovery is broken on kernel 3.7.1 for UDP sockets
Date: Thu, 20 Dec 2012 08:34:46 +0100
Message-ID: <20121220073445.GM18940@secunet.com>
In-Reply-To: <1355945864.2676.21.camel@bwh-desktop.uk.solarflarecom.com>

On Wed, Dec 19, 2012 at 07:37:44PM +0000, Ben Hutchings wrote:
> On Wed, 2012-12-19 at 18:27 +0400, Yurij M. Plotnikov wrote:
> > On 12/19/12 17:35, Ben Hutchings wrote:
> > > On Wed, 2012-12-19 at 17:10 +0400, Yurij M. Plotnikov wrote:
> > >    
> > >> On kernel 3.7.1 I get strange behaviour of IP_MTU_DISCOVER socket
> > >> option. The behaviour in case of IP_PMTUDISC_DO and IP_PMTUDISC_WANT
> > >> values of IP_MTU_DISCOVER socket option on SOCK_DGRAM socket are the
> > >> same and packet is always sent with "Don't Fragment" bit in case of
> > >> IP_PMTUDISC_WANT. Also, the value of IP_MTU socket option is not updated.
> > >>      
> > > You could try reverting:
> > >
> > > commit ee9a8f7ab2edf801b8b514c310455c94acc232f6
> > > Author: Steffen Klassert<steffen.klassert@secunet.com>
> > > Date:   Mon Oct 8 00:56:54 2012 +0000
> > >
> > >      ipv4: Don't report stale pmtu values to userspace
> > >
> > >      We report cached pmtu values even if they are already expired.
> > >      Change this to not report these values after they are expired
> > >      and fix a race in the expire time calculation, as suggested by
> > >      Eric Dumazet.
> > >
> > > Still, PMTU information is not supposed to expire for 10 minutes...
> > >
> > >    
> > With reverted commit there is no such problem on 3.7.1: IP_MTU is 
> > updated and DF is set only for the first packet in case of 
> > IP_PMTUDISC_WANT.
> [...]
> 
> So it looks like something is going wrong with the expiry calculation
> here.
> 
> This change shouldn't affect the PMTU actually used by the kernel, but
> could affect Onload since that relies on netlink route updates to keep
> in synch.  You didn't say you were using Onload, but if you are then we
> should not bother netdev with this until we can demonstrate a problem
> that involves only the kernel stack.
> 

I'm really surprised that this change can have such an effect;
it changes nothing in the kernel's PMTU handling. Looking at the
code, I found that we may report an MTU value from a stale
dst_entry when the MTU is queried with the IP_MTU socket
option. But a subsequent send() should update the socket's cached
dst_entry, so at most one packet should be affected.
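
For reference, the userspace side of that query could be sketched
roughly as follows. This is only an illustration, assuming a Linux/glibc
system where <netinet/in.h> provides IP_MTU, IP_MTU_DISCOVER and the
IP_PMTUDISC_* values; the helper names pmtudisc_name(),
udp_connect_pmtu() and query_ip_mtu() are made up for this example and
are not taken from the reporter's test program.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: printable name of an IP_MTU_DISCOVER mode. */
const char *pmtudisc_name(int mode)
{
	switch (mode) {
	case IP_PMTUDISC_DONT: return "IP_PMTUDISC_DONT";
	case IP_PMTUDISC_WANT: return "IP_PMTUDISC_WANT";
	case IP_PMTUDISC_DO:   return "IP_PMTUDISC_DO";
	default:               return "unknown";
	}
}

/*
 * Hypothetical helper: create a UDP socket, select the PMTU discovery
 * mode and connect it to addr:port. Connecting a UDP socket sends
 * nothing; it only fixes the destination so that IP_MTU can be queried.
 * Returns the socket descriptor, or -1 on any failure.
 */
int udp_connect_pmtu(const char *addr, unsigned short port, int mode)
{
	struct sockaddr_in sin;
	int fd;

	assert(addr != NULL);

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(port);
	if (inet_pton(AF_INET, addr, &sin.sin_addr) != 1)
		return -1;	/* not a dotted-quad IPv4 address */

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;

	if (setsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER, &mode, sizeof(mode)) < 0 ||
	    connect(fd, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Hypothetical helper: read the MTU the kernel currently reports for
 * this socket's cached route. Returns the MTU, or -1 if getsockopt()
 * fails (for example on an unconnected or invalid descriptor).
 */
int query_ip_mtu(int fd)
{
	int mtu = 0;
	socklen_t len = sizeof(mtu);

	if (getsockopt(fd, IPPROTO_IP, IP_MTU, &mtu, &len) < 0)
		return -1;
	return mtu;
}
```

Calling query_ip_mtu() before and after sending a datagram larger than
the path MTU, on a socket opened with IP_PMTUDISC_WANT, would show
whether the reported value follows the kernel's view of the path.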

Does the patch below change anything?


diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index 3c9d208..1049ce0 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -1198,7 +1198,7 @@ static int do_ip_getsockopt(struct sock *sk, int level, int optname,
 	{
 		struct dst_entry *dst;
 		val = 0;
-		dst = sk_dst_get(sk);
+		dst = sk_dst_check(sk, 0);
 		if (dst) {
 			val = dst_mtu(dst);
 			dst_release(dst);
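
To illustrate what the one-line change is meant to do, here is a small
userspace model of the two lookups. It is not kernel code: the toy_*
types and functions are hypothetical stand-ins, reference counting is
left out, and the validity test is reduced to a single flag, whereas the
real sk_dst_check() consults the route's own check operation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a dst_entry; all names here are hypothetical. */
struct toy_dst {
	int  mtu;
	bool obsolete;	/* set once the route has been invalidated */
};

/* Toy stand-in for a socket holding a cached route. */
struct toy_sock {
	struct toy_dst *dst_cache;
};

/* Modelled on sk_dst_get(): return whatever is cached, valid or not. */
struct toy_dst *toy_dst_get(struct toy_sock *sk)
{
	assert(sk != NULL);
	return sk->dst_cache;
}

/*
 * Modelled on sk_dst_check(): if the cached entry is no longer valid,
 * drop it from the socket and report that there is no usable route.
 */
struct toy_dst *toy_dst_check(struct toy_sock *sk)
{
	struct toy_dst *dst = toy_dst_get(sk);

	if (dst && dst->obsolete) {
		sk->dst_cache = NULL;
		return NULL;
	}
	return dst;
}

/*
 * Modelled on the IP_MTU branch shown in the patch: val stays 0 when
 * no route is available. use_check selects the patched lookup.
 */
int toy_ip_mtu(struct toy_sock *sk, bool use_check)
{
	struct toy_dst *dst;
	int val = 0;

	dst = use_check ? toy_dst_check(sk) : toy_dst_get(sk);
	if (dst)
		val = dst->mtu;
	return val;
}
```

In this model, the unpatched lookup keeps returning the MTU stored in an
entry marked obsolete, while the patched lookup returns 0 and clears the
socket's cache, leaving the next send to look up a fresh route.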

Thread overview: 24+ messages
2012-12-19 13:10 PMTU discovery is broken on kernel 3.7.1 for UDP sockets Yurij M. Plotnikov
2012-12-19 13:35 ` Ben Hutchings
2012-12-19 14:27   ` Yurij M. Plotnikov
2012-12-19 19:37     ` Ben Hutchings
2012-12-20  7:14       ` Yurij M. Plotnikov
2012-12-20  7:34       ` Steffen Klassert [this message]
2012-12-20 11:22         ` Yurij M. Plotnikov
2012-12-20 12:35           ` Steffen Klassert
2012-12-21 10:22             ` Steffen Klassert
2013-01-14  8:26               ` Yurij M. Plotnikov
2013-01-14 12:52                 ` Steffen Klassert
2013-01-18  8:11                 ` Steffen Klassert
2013-01-18  8:14                   ` [RFC PATCH 1/3] ipv4: Invalidate the socket cached route on pmtu events if possible Steffen Klassert
2013-01-18 19:38                     ` David Miller
2013-01-19  0:54                     ` Julian Anastasov
2013-01-21  6:43                       ` Steffen Klassert
2013-01-18  8:15                   ` [RFC PATCH 2/3] ipv4: Add a socket release callback for datagram sockets Steffen Klassert
2013-01-18 19:39                     ` David Miller
2013-01-18  8:16                   ` [RFC PATCH 3/3] xfrm4: Invalidate all ipv4 routes on IPsec pmtu events Steffen Klassert
2013-01-18 19:39                     ` David Miller
2013-01-21  6:48                       ` Steffen Klassert
2013-01-21 12:04                       ` Steffen Klassert
2013-01-21 11:31                   ` PMTU discovery is broken on kernel 3.7.1 for UDP sockets Yurij M. Plotnikov
2013-01-21 11:38                     ` Steffen Klassert
