stable.vger.kernel.org archive mirror
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Xiumei Mu <xmu@redhat.com>,
	Stefano Brivio <sbrivio@redhat.com>,
	David Ahern <dsahern@gmail.com>,
	"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 4.15 16/47] ipv6: Reflect MTU changes on PMTU of exceptions for MTU-less routes
Date: Thu, 29 Mar 2018 19:59:57 +0200
Message-ID: <20180329175730.282656709@linuxfoundation.org>
In-Reply-To: <20180329175729.225211114@linuxfoundation.org>

4.15-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Stefano Brivio <sbrivio@redhat.com>


[ Upstream commit e9fa1495d738e34fcec88a3d2ec9101a9ee5b310 ]

Currently, administrative MTU changes on a given netdevice are
not reflected on route exceptions for MTU-less routes, with a
set PMTU value, for that device:

 # ip -6 route get 2001:db8::b
 2001:db8::b from :: dev vti_a proto kernel src 2001:db8::a metric 256 pref medium
 # ping6 -c 1 -q -s10000 2001:db8::b > /dev/null
 # ip netns exec a ip -6 route get 2001:db8::b
 2001:db8::b from :: dev vti_a src 2001:db8::a metric 0
     cache expires 571sec mtu 4926 pref medium
 # ip link set dev vti_a mtu 3000
 # ip -6 route get 2001:db8::b
 2001:db8::b from :: dev vti_a src 2001:db8::a metric 0
     cache expires 571sec mtu 4926 pref medium
 # ip link set dev vti_a mtu 9000
 # ip -6 route get 2001:db8::b
 2001:db8::b from :: dev vti_a src 2001:db8::a metric 0
     cache expires 571sec mtu 4926 pref medium

The first issue is that since commit fb56be83e43d ("net-ipv6: on
device mtu change do not add mtu to mtu-less routes") we don't
call rt6_exceptions_update_pmtu() from rt6_mtu_change_route(),
which handles administrative MTU changes, if the regular route
is MTU-less.

However, PMTU exceptions should always be updated, as long as
RTAX_MTU is not locked. Keep the check for an MTU-less main route,
as introduced by that commit, but, for exceptions,
call rt6_exceptions_update_pmtu() regardless of that check.
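
As a rough user-space illustration of the gating described above (not
kernel code): struct sketch_route and sketch_mtu_change() are
hypothetical names, and only MTU decreases are modelled here. The point
is that the MTU-less check applies to the main route alone, while the
exceptions are walked whenever the metric is not locked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a route and its exceptions. */
struct sketch_route {
	int mtu_metric;		/* 0 models an MTU-less route (no RTAX_MTU) */
	bool mtu_locked;	/* models dst_metric_locked(..., RTAX_MTU) */
	int *exc_pmtu;		/* PMTU of each exception, 0 means unset */
	size_t n_exc;
};

/* Models the gating only; MTU increases are deliberately left out. */
static void sketch_mtu_change(struct sketch_route *rt, int new_mtu)
{
	size_t i;

	assert(rt != NULL);

	/* A locked metric blocks both the main route and its exceptions. */
	if (rt->mtu_locked)
		return;

	/* Main route: still skipped when it is MTU-less. */
	if (rt->mtu_metric && rt->mtu_metric > new_mtu)
		rt->mtu_metric = new_mtu;

	/* Exceptions: visited regardless of the MTU-less check above. */
	for (i = 0; i < rt->n_exc; i++) {
		if (rt->exc_pmtu[i] && rt->exc_pmtu[i] > new_mtu)
			rt->exc_pmtu[i] = new_mtu;
	}
}
```

With an MTU-less main route and one exception holding PMTU 4926,
calling sketch_mtu_change() with 3000 leaves the main route MTU-less
and lowers the exception to 3000.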

Once that is fixed, one problem remains: MTU changes are not
reflected if the new MTU is higher than the previous one,
because rt6_exceptions_update_pmtu() doesn't allow that. We
should instead allow PMTU increase if the old PMTU matches the
local MTU, as that implies that the old MTU was the lowest in the
path, and PMTU discovery might lead to different results.

The existing check in rt6_mtu_change_route() correctly took that
case into account (for regular routes only), so factor it out
and re-use it also in rt6_exceptions_update_pmtu().
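
The factored-out check can be modelled in user space roughly as
follows. This is a sketch, not the kernel function:
sketch_mtu_change_allowed() is a hypothetical name, plain integers
replace dst_mtu(&rt->dst) and idev->cnf.mtu6, and it assumes the local
MTU passed in is the value from before the administrative change.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the check shared by regular routes and
 * exceptions:
 *   route_pmtu - current PMTU of the route or exception
 *   local_mtu  - local (device) MTU before the change (assumption)
 *   new_mtu    - MTU being set administratively
 */
static bool sketch_mtu_change_allowed(int route_pmtu, int local_mtu,
				      int new_mtu)
{
	assert(new_mtu > 0);

	/* Decrease (or no change): the new MTU is the lowest in the path. */
	if (route_pmtu >= new_mtu)
		return true;

	/* Increase: allowed only if the local MTU was the bottleneck. */
	if (route_pmtu == local_mtu)
		return true;

	return false;
}
```

For example, an exception with PMTU 4926 is updated when the device MTU
drops to 3000. A later increase to 9000 is then allowed because the
exception's PMTU (3000) equals the local MTU. An exception whose PMTU
was learned from a smaller hop further along the path is left alone.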

While at it, fix comment style and grammar, and try to be a bit
more descriptive.

Reported-by: Xiumei Mu <xmu@redhat.com>
Fixes: fb56be83e43d ("net-ipv6: on device mtu change do not add mtu to mtu-less routes")
Fixes: f5bbe7ee79c2 ("ipv6: prepare rt6_mtu_change() for exception table")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv6/route.c |   71 ++++++++++++++++++++++++++++++++-----------------------
 1 file changed, 42 insertions(+), 29 deletions(-)

--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1510,7 +1510,30 @@ static void rt6_exceptions_remove_prefsr
 	}
 }
 
-static void rt6_exceptions_update_pmtu(struct rt6_info *rt, int mtu)
+static bool rt6_mtu_change_route_allowed(struct inet6_dev *idev,
+					 struct rt6_info *rt, int mtu)
+{
+	/* If the new MTU is lower than the route PMTU, this new MTU will be the
+	 * lowest MTU in the path: always allow updating the route PMTU to
+	 * reflect PMTU decreases.
+	 *
+	 * If the new MTU is higher, and the route PMTU is equal to the local
+	 * MTU, this means the old MTU is the lowest in the path, so allow
+	 * updating it: if other nodes now have lower MTUs, PMTU discovery will
+	 * handle this.
+	 */
+
+	if (dst_mtu(&rt->dst) >= mtu)
+		return true;
+
+	if (dst_mtu(&rt->dst) == idev->cnf.mtu6)
+		return true;
+
+	return false;
+}
+
+static void rt6_exceptions_update_pmtu(struct inet6_dev *idev,
+				       struct rt6_info *rt, int mtu)
 {
 	struct rt6_exception_bucket *bucket;
 	struct rt6_exception *rt6_ex;
@@ -1519,20 +1542,22 @@ static void rt6_exceptions_update_pmtu(s
 	bucket = rcu_dereference_protected(rt->rt6i_exception_bucket,
 					lockdep_is_held(&rt6_exception_lock));
 
-	if (bucket) {
-		for (i = 0; i < FIB6_EXCEPTION_BUCKET_SIZE; i++) {
-			hlist_for_each_entry(rt6_ex, &bucket->chain, hlist) {
-				struct rt6_info *entry = rt6_ex->rt6i;
-				/* For RTF_CACHE with rt6i_pmtu == 0
-				 * (i.e. a redirected route),
-				 * the metrics of its rt->dst.from has already
-				 * been updated.
-				 */
-				if (entry->rt6i_pmtu && entry->rt6i_pmtu > mtu)
-					entry->rt6i_pmtu = mtu;
-			}
-			bucket++;
+	if (!bucket)
+		return;
+
+	for (i = 0; i < FIB6_EXCEPTION_BUCKET_SIZE; i++) {
+		hlist_for_each_entry(rt6_ex, &bucket->chain, hlist) {
+			struct rt6_info *entry = rt6_ex->rt6i;
+
+			/* For RTF_CACHE with rt6i_pmtu == 0 (i.e. a redirected
+			 * route), the metrics of its rt->dst.from have already
+			 * been updated.
+			 */
+			if (entry->rt6i_pmtu &&
+			    rt6_mtu_change_route_allowed(idev, entry, mtu))
+				entry->rt6i_pmtu = mtu;
 		}
+		bucket++;
 	}
 }
 
@@ -3521,25 +3546,13 @@ static int rt6_mtu_change_route(struct r
 	   Since RFC 1981 doesn't include administrative MTU increase
 	   update PMTU increase is a MUST. (i.e. jumbo frame)
 	 */
-	/*
-	   If new MTU is less than route PMTU, this new MTU will be the
-	   lowest MTU in the path, update the route PMTU to reflect PMTU
-	   decreases; if new MTU is greater than route PMTU, and the
-	   old MTU is the lowest MTU in the path, update the route PMTU
-	   to reflect the increase. In this case if the other nodes' MTU
-	   also have the lowest MTU, TOO BIG MESSAGE will be lead to
-	   PMTU discovery.
-	 */
 	if (rt->dst.dev == arg->dev &&
-	    dst_metric_raw(&rt->dst, RTAX_MTU) &&
 	    !dst_metric_locked(&rt->dst, RTAX_MTU)) {
 		spin_lock_bh(&rt6_exception_lock);
-		if (dst_mtu(&rt->dst) >= arg->mtu ||
-		    (dst_mtu(&rt->dst) < arg->mtu &&
-		     dst_mtu(&rt->dst) == idev->cnf.mtu6)) {
+		if (dst_metric_raw(&rt->dst, RTAX_MTU) &&
+		    rt6_mtu_change_route_allowed(idev, rt, arg->mtu))
 			dst_metric_set(&rt->dst, RTAX_MTU, arg->mtu);
-		}
-		rt6_exceptions_update_pmtu(rt, arg->mtu);
+		rt6_exceptions_update_pmtu(idev, rt, arg->mtu);
 		spin_unlock_bh(&rt6_exception_lock);
 	}
 	return 0;
