From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Ahern
Subject: Re: PMTU discovery broken in Linux for UDP/raw application if the socket is not bound to a device
Date: Sat, 6 Oct 2018 21:28:57 -0600
Message-ID:
References: <811A4496-322E-443D-9B07-46E6BC406C38@juniper.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Cc: Reji Thomas, Yogesh Ankolekar, "netdev@vger.kernel.org"
To: Preethi Ramachandra
Return-path:
Received: from mail-io1-f68.google.com ([209.85.166.68]:42304 "EHLO mail-io1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726236AbeJGKeo (ORCPT ); Sun, 7 Oct 2018 06:34:44 -0400
Received: by mail-io1-f68.google.com with SMTP id n18-v6so13479170ioa.9 for ; Sat, 06 Oct 2018 20:29:02 -0700 (PDT)
In-Reply-To:
Content-Language: en-US
Sender: netdev-owner@vger.kernel.org
List-ID:

The correct mailing list is netdev@vger.kernel.org (added)
non-text emails will be rejected.

On 10/3/18 10:15 PM, Preethi Ramachandra wrote:
> Hi,
>
> While testing the PMTU discovery for UDP/raw applications, Linux is not
> doing PMTU discovery if the UDP server socket is not bound to a device.
> In the scenario we are testing there could be multiple VRF devices
> created and an application like UDP/RAW can use a common socket for all
> vrf devices. While sending packet IP_PKTINFO socket option can be used
> to specify the vrf interface through which packet will be sent out. In
> this case, when packet too big icmp6 error message comes back to Linux
> on a vrf device, a route lookup is done on default routing-table(0) for
> src/dst address which case, the route will not be found and packet is
> dropped. If the route lookup happened with proper VRF device (packet’s
> incoming index), the route lookup succeeds, PMTU discovery is successful.
>
> This might need a fix, please take a look.
>
> *Linux version*
>
> Linux 4.8.24
>
> *Code flow*
>
> Linux code where it expects socket’s bound device in order for PMTU
> discovery to happen.
>
> *void ip6_sk_update_pmtu*(struct sk_buff *skb, struct sock *sk, __be32 mtu)
> {
>         struct dst_entry *dst;
>
>         ip6_update_pmtu(skb, sock_net(sk), mtu,
>                         sk->sk_bound_dev_if, sk->sk_mark, sk->sk_uid);
>
> *<<<<< This is the point where it expects socket’s sk_bound_dev_if to be
> set. In our testing this is actually 0, since the socket is not really
> bound to a vrf device.*

Try this based on top of tree for 4.19-next (whitespace damaged on paste
so you'll need to manually apply and handle differences with 4.8):

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 6c1d817151ca..50b95b48b911 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -2360,10 +2360,13 @@ EXPORT_SYMBOL_GPL(ip6_update_pmtu);

 void ip6_sk_update_pmtu(struct sk_buff *skb, struct sock *sk, __be32 mtu)
 {
+	int oif = sk->sk_bound_dev_if;
 	struct dst_entry *dst;

-	ip6_update_pmtu(skb, sock_net(sk), mtu,
-			sk->sk_bound_dev_if, sk->sk_mark, sk->sk_uid);
+	if (!oif && skb->dev)
+		oif = l3mdev_master_ifindex(skb->dev);
+
+	ip6_update_pmtu(skb, sock_net(sk), mtu, oif, sk->sk_mark, sk->sk_uid);

 	dst = __sk_dst_get(sk);
 	if (!dst || !dst->obsolete ||
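
For illustration only, the oif selection in the patch above can be modelled in user space roughly as follows. This is a sketch, not kernel code: struct model_dev, struct model_sock, model_master_ifindex() and pmtu_lookup_oif() are hypothetical stand-ins, and the helper only approximates l3mdev_master_ifindex() under the assumption that a VRF (l3mdev master) device reports its own index and an enslaved device reports its master's index.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net_device; not the kernel definition. */
struct model_dev {
	int ifindex;		/* index of this device */
	int is_l3_master;	/* non-zero if this device is itself a VRF device */
	int master_ifindex;	/* index of the enslaving VRF device, 0 if none */
};

/* Hypothetical stand-in for struct sock. */
struct model_sock {
	int bound_dev_if;	/* mirrors sk->sk_bound_dev_if; 0 means unbound */
};

/*
 * Assumed approximation of l3mdev_master_ifindex(): a VRF device yields its
 * own index, an enslaved device yields its master's index, anything else 0.
 */
static int model_master_ifindex(const struct model_dev *dev)
{
	if (dev->is_l3_master)
		return dev->ifindex;
	return dev->master_ifindex;
}

/*
 * Same decision as the patched ip6_sk_update_pmtu(): prefer the device the
 * socket is bound to, otherwise fall back to the L3 master of the device the
 * ICMPv6 error arrived on. in_dev may be NULL, like skb->dev in the patch.
 */
static int pmtu_lookup_oif(const struct model_sock *sk,
			   const struct model_dev *in_dev)
{
	int oif;

	assert(sk != NULL);
	oif = sk->bound_dev_if;

	if (!oif && in_dev)
		oif = model_master_ifindex(in_dev);

	return oif;
}
```

With an unbound socket and an ingress device enslaved to a VRF with index 10, pmtu_lookup_oif() returns 10, so the lookup would be scoped to that VRF rather than to the default table. A socket bound to a device keeps its bound index, and an unbound socket with no ingress device still yields 0, as before the patch.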