netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH net-next 0/2] Mitigate the Issue of Expired Routes in Linux IPv6 Routing Tables
@ 2023-05-17  4:27 Kui-Feng Lee
  2023-05-17  4:27 ` [RFC PATCH net-next 1/2] net: Remove expired routes with separated timers Kui-Feng Lee
  2023-05-17  4:27 ` [RFC PATCH net-next 2/2] net: Remove unused code and variables Kui-Feng Lee
  0 siblings, 2 replies; 6+ messages in thread
From: Kui-Feng Lee @ 2023-05-17  4:27 UTC (permalink / raw)
  To: netdev, ast, martin.lau, kernel-team; +Cc: Kui-Feng Lee

The size of a Linux IPv6 routing table can become a big problem if not
managed appropriately.  Now, Linux has a garbage collector to remove
expired routes periodically.  However, this may lead to a situation in
which the routing path is blocked for a long period due to an
excessive number of routes.

For example, years ago, there is a commit about "ICMPv6 Packet too big
messages".  The root cause is that malicious ICMPv6 packets were sent
back for every small packet sent to them. These packets have to
lookup/insert a new route, putting hosts under high stress due to
contention on a spinlock while one is stuck in fib6_run_gc().

Why Route Expires
=================

Users can add IPv6 routes with an expiration time manually. However,
the Neighbor Discovery protocol may also generate routes that can
expire.  For example, Router Advertisement (RA) messages may create a
default route with an expiration time. [RFC 4861] For IPv4, it is not
possible to set an expiration time for a route, and there is no RA, so
there is no need to worry about such issues.

Create Routes with Expires
==========================

You can create routes with expires with the  command.

For example,

    ip -6 route add 2001:b000:591::3 via fe80::5054:ff:fe12:3457 \ 
        dev enp0s3 expires 30

The route that has been generated will be deleted automatically in 30
seconds.

GC of FIB6
==========

The function called fib6_run_gc() is responsible for performing
garbage collection (GC) for the Linux IPv6 stack. It checks for the
expiration of every route by traversing the tries of routing
tables. The time taken to traverse a routing table increases with its
size. Holding the routing table lock during traversal is particularly
undesirable. Therefore, it is preferable to keep the lock for the
shortest possible duration.

Solution
========

The cause of the issue is keeping the routing table locked during the
traversal of large tries. To address this, the patchset eliminates
garbage collection that does the tries traversal and introduces
individual timers for each route that eventually expires.  Walking
trials are no longer necessary with the timers. Additionally, the time
required to handle a timer is consistent.

If the expiration time is long, the timer becomes less precise. The
drawback is that the longer the expiration time, the less accurate the
timer.

Kui-Feng Lee (2):
  net: Remove expired routes with separated timers.
  net: Remove unused code and variables.

 include/net/ip6_fib.h    |  21 ++---
 include/net/ip6_route.h  |   2 -
 include/net/netns/ipv6.h |   6 --
 net/ipv6/addrconf.c      |   8 +-
 net/ipv6/ip6_fib.c       | 160 ++++++++++++++++++---------------------
 net/ipv6/ndisc.c         |   4 +-
 net/ipv6/route.c         | 119 ++---------------------------
 7 files changed, 95 insertions(+), 225 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [RFC PATCH net-next 1/2] net: Remove expired routes with separated timers.
  2023-05-17  4:27 [RFC PATCH net-next 0/2] Mitigate the Issue of Expired Routes in Linux IPv6 Routing Tables Kui-Feng Lee
@ 2023-05-17  4:27 ` Kui-Feng Lee
  2023-05-17 15:26   ` Jakub Kicinski
  2023-05-17  4:27 ` [RFC PATCH net-next 2/2] net: Remove unused code and variables Kui-Feng Lee
  1 sibling, 1 reply; 6+ messages in thread
From: Kui-Feng Lee @ 2023-05-17  4:27 UTC (permalink / raw)
  To: netdev, ast, martin.lau, kernel-team; +Cc: Kui-Feng Lee

FIB6 GC walks tries of fib6_tables to remove expired routes.  Walking a
tree can be expensive if the number of routes in a table is big.
Creating a separated timer for each route that can expire will avoid
this potential issue.

Signed-off-by: Kui-Feng Lee <kuifeng@meta.com>
---
 include/net/ip6_fib.h | 19 ++++------
 net/ipv6/addrconf.c   |  8 ++--
 net/ipv6/ip6_fib.c    | 88 ++++++++++++++++++++++++++++++++++++-------
 net/ipv6/ndisc.c      |  2 +-
 net/ipv6/route.c      |  6 +--
 5 files changed, 91 insertions(+), 32 deletions(-)

diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index 05e6f756feaf..850995306718 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -161,6 +161,8 @@ struct fib6_nh {
 	struct rt6_exception_bucket __rcu *rt6i_exception_bucket;
 };
 
+struct fib6_info_timer;
+
 struct fib6_info {
 	struct fib6_table		*fib6_table;
 	struct fib6_info __rcu		*fib6_next;
@@ -179,6 +181,7 @@ struct fib6_info {
 
 	refcount_t			fib6_ref;
 	unsigned long			expires;
+	struct fib6_info_timer		*timer;
 	struct dst_metrics		*fib6_metrics;
 #define fib6_pmtu		fib6_metrics->metrics[RTAX_MTU-1]
 
@@ -247,18 +250,11 @@ static inline bool fib6_requires_src(const struct fib6_info *rt)
 	return rt->fib6_src.plen > 0;
 }
 
-static inline void fib6_clean_expires(struct fib6_info *f6i)
-{
-	f6i->fib6_flags &= ~RTF_EXPIRES;
-	f6i->expires = 0;
-}
+void fib6_clean_expires(struct fib6_info *f6i);
 
-static inline void fib6_set_expires(struct fib6_info *f6i,
-				    unsigned long expires)
-{
-	f6i->expires = expires;
-	f6i->fib6_flags |= RTF_EXPIRES;
-}
+void fib6_set_expires(struct net *net,
+		      struct fib6_info *f6i,
+		      unsigned long expires);
 
 static inline bool fib6_check_expired(const struct fib6_info *f6i)
 {
@@ -388,6 +384,7 @@ struct fib6_table {
 	struct inet_peer_base	tb6_peers;
 	unsigned int		flags;
 	unsigned int		fib_seq;
+	struct hlist_head	tb6_timer_hlist;
 #define RT6_TABLE_HAS_DFLT_ROUTER	BIT(0)
 };
 
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 3797917237d0..13e2366613c4 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -1254,7 +1254,8 @@ cleanup_prefix_route(struct inet6_ifaddr *ifp, unsigned long expires,
 			ip6_del_rt(dev_net(ifp->idev->dev), f6i, false);
 		else {
 			if (!(f6i->fib6_flags & RTF_EXPIRES))
-				fib6_set_expires(f6i, expires);
+				fib6_set_expires(dev_net(ifp->idev->dev),
+						 f6i, expires);
 			fib6_info_release(f6i);
 		}
 	}
@@ -2762,7 +2763,8 @@ void addrconf_prefix_rcv(struct net_device *dev, u8 *opt, int len, bool sllao)
 				rt = NULL;
 			} else if (addrconf_finite_timeout(rt_expires)) {
 				/* not infinity */
-				fib6_set_expires(rt, jiffies + rt_expires);
+				fib6_set_expires(net, rt,
+						 jiffies + rt_expires);
 			} else {
 				fib6_clean_expires(rt);
 			}
@@ -4723,7 +4725,7 @@ static int modify_prefix_route(struct inet6_ifaddr *ifp,
 		if (!expires)
 			fib6_clean_expires(f6i);
 		else
-			fib6_set_expires(f6i, expires);
+			fib6_set_expires(dev_net(ifp->idev->dev), f6i, expires);
 
 		fib6_info_release(f6i);
 	}
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 2438da5ff6da..8a10a0355816 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -25,6 +25,7 @@
 #include <linux/init.h>
 #include <linux/list.h>
 #include <linux/slab.h>
+#include <linux/timer.h>
 
 #include <net/ip.h>
 #include <net/ipv6.h>
@@ -54,6 +55,12 @@ struct fib6_cleaner {
 #define FWS_INIT FWS_L
 #endif
 
+struct fib6_info_timer {
+	struct timer_list timer;
+	struct fib6_info *f6i;
+	struct net *net;
+};
+
 static struct fib6_info *fib6_find_prefix(struct net *net,
 					 struct fib6_table *table,
 					 struct fib6_node *fn);
@@ -144,6 +151,66 @@ static __be32 addr_bit_set(const void *token, int fn_bit)
 	       addr[fn_bit >> 5];
 }
 
+static void f6i_gc_timer_cb(struct timer_list *t)
+{
+	struct fib6_info_timer *timer;
+	struct nl_info info = {
+		.nlh = NULL,
+	};
+	struct fib6_info *f6i;
+	int res;
+
+	timer = from_timer(timer, t, timer);
+	info.nl_net = timer->net;
+	f6i = timer->f6i;
+	spin_lock(&f6i->fib6_table->tb6_lock);
+
+	res = fib6_del(f6i, &info);
+	if (res != 0) {
+#if RT6_DEBUG >= 2
+		pr_debug("%s: del failed: rt=%p@%p err=%d\n",
+			 __func__, f6i,
+			 rcu_access_pointer(f6i->fib6_node),
+			 res);
+#endif
+	}
+
+	spin_unlock(&f6i->fib6_table->tb6_lock);
+
+	fib6_info_release(f6i);
+}
+
+void fib6_clean_expires(struct fib6_info *f6i)
+{
+	f6i->fib6_flags &= ~RTF_EXPIRES;
+	f6i->expires = 0;
+	if (!f6i->timer)
+		return;
+	if (try_to_del_timer_sync(&f6i->timer->timer) == 1)
+		fib6_info_release(f6i);
+}
+
+void fib6_set_expires(struct net *net,struct fib6_info *f6i,
+		      unsigned long expires)
+{
+	f6i->expires = expires;
+	f6i->fib6_flags |= RTF_EXPIRES;
+	if (!f6i->timer) {
+		f6i->timer = kzalloc(sizeof(*f6i->timer), GFP_ATOMIC);
+		if (!f6i->timer) {
+			/* XXX: error handling */
+			panic("fib6_set_expires: kzalloc failed");
+			return;
+		}
+		f6i->timer->f6i = f6i;
+		f6i->timer->net = net;
+		timer_setup(&f6i->timer->timer, f6i_gc_timer_cb, 0);
+	}
+	fib6_info_hold(f6i);
+	if (mod_timer(&f6i->timer->timer, expires) == 1)
+		fib6_info_release(f6i);
+}
+
 struct fib6_info *fib6_info_alloc(gfp_t gfp_flags, bool with_fib6_nh)
 {
 	struct fib6_info *f6i;
@@ -175,6 +242,7 @@ void fib6_info_destroy_rcu(struct rcu_head *head)
 		fib6_nh_release(f6i->fib6_nh);
 
 	ip_fib_metrics_put(f6i->fib6_metrics);
+	kfree(f6i->timer);
 	kfree(f6i);
 }
 EXPORT_SYMBOL_GPL(fib6_info_destroy_rcu);
@@ -246,6 +314,7 @@ static struct fib6_table *fib6_alloc_table(struct net *net, u32 id)
 				   net->ipv6.fib6_null_entry);
 		table->tb6_root.fn_flags = RTN_ROOT | RTN_TL_ROOT | RTN_RTINFO;
 		inet_peer_base_init(&table->tb6_peers);
+		INIT_HLIST_HEAD(&table->tb6_timer_hlist);
 	}
 
 	return table;
@@ -1120,7 +1189,8 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct fib6_info *rt,
 				if (!(rt->fib6_flags & RTF_EXPIRES))
 					fib6_clean_expires(iter);
 				else
-					fib6_set_expires(iter, rt->expires);
+					fib6_set_expires(info->nl_net,
+							 iter, rt->expires);
 
 				if (rt->fib6_pmtu)
 					fib6_metric_set(iter, RTAX_MTU,
@@ -2025,6 +2095,9 @@ int fib6_del(struct fib6_info *rt, struct nl_info *info)
 		if (rt == cur) {
 			if (fib6_requires_src(cur))
 				fib6_routes_require_src_dec(info->nl_net);
+			if (cur->timer &&
+			    try_to_del_timer_sync(&cur->timer->timer) == 1)
+				fib6_info_release(cur);
 			fib6_del_route(table, fn, rtp, info);
 			return 0;
 		}
@@ -2290,19 +2363,6 @@ static int fib6_age(struct fib6_info *rt, void *arg)
 	struct fib6_gc_args *gc_args = arg;
 	unsigned long now = jiffies;
 
-	/*
-	 *	check addrconf expiration here.
-	 *	Routes are expired even if they are in use.
-	 */
-
-	if (rt->fib6_flags & RTF_EXPIRES && rt->expires) {
-		if (time_after(now, rt->expires)) {
-			RT6_TRACE("expiring %p\n", rt);
-			return -1;
-		}
-		gc_args->more++;
-	}
-
 	/*	Also age clones in the exception table.
 	 *	Note, that clones are aged out
 	 *	only if they are not in use now.
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 18634ebd20a4..1d4cf7f73097 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1407,7 +1407,7 @@ static enum skb_drop_reason ndisc_router_discovery(struct sk_buff *skb)
 	}
 
 	if (rt)
-		fib6_set_expires(rt, jiffies + (HZ * lifetime));
+		fib6_set_expires(net, rt, jiffies + (HZ * lifetime));
 	if (in6_dev->cnf.accept_ra_min_hop_limit < 256 &&
 	    ra_msg->icmph.icmp6_hop_limit) {
 		if (in6_dev->cnf.accept_ra_min_hop_limit <= ra_msg->icmph.icmp6_hop_limit) {
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index e3aec46bd466..87721a2a91b6 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -990,7 +990,7 @@ int rt6_route_rcv(struct net_device *dev, u8 *opt, int len,
 		if (!addrconf_finite_timeout(lifetime))
 			fib6_clean_expires(rt);
 		else
-			fib6_set_expires(rt, jiffies + HZ * lifetime);
+			fib6_set_expires(net, rt, jiffies + HZ * lifetime);
 
 		fib6_info_release(rt);
 	}
@@ -3755,8 +3755,8 @@ static struct fib6_info *ip6_route_info_create(struct fib6_config *cfg,
 		rt->dst_nocount = true;
 
 	if (cfg->fc_flags & RTF_EXPIRES)
-		fib6_set_expires(rt, jiffies +
-				clock_t_to_jiffies(cfg->fc_expires));
+		fib6_set_expires(net, rt, jiffies +
+				 clock_t_to_jiffies(cfg->fc_expires));
 	else
 		fib6_clean_expires(rt);
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [RFC PATCH net-next 2/2] net: Remove unused code and variables.
  2023-05-17  4:27 [RFC PATCH net-next 0/2] Mitigate the Issue of Expired Routes in Linux IPv6 Routing Tables Kui-Feng Lee
  2023-05-17  4:27 ` [RFC PATCH net-next 1/2] net: Remove expired routes with separated timers Kui-Feng Lee
@ 2023-05-17  4:27 ` Kui-Feng Lee
  2023-05-17 14:56   ` Simon Horman
  1 sibling, 1 reply; 6+ messages in thread
From: Kui-Feng Lee @ 2023-05-17  4:27 UTC (permalink / raw)
  To: netdev, ast, martin.lau, kernel-team; +Cc: Kui-Feng Lee

Since GC has been removed, some functions and variables are useless.  That
includes some sysctl variables that control GC.

Signed-off-by: Kui-Feng Lee <kuifeng@meta.com>
---
 include/net/ip6_fib.h    |   2 -
 include/net/ip6_route.h  |   2 -
 include/net/netns/ipv6.h |   6 ---
 net/ipv6/ip6_fib.c       |  72 -------------------------
 net/ipv6/ndisc.c         |   2 -
 net/ipv6/route.c         | 113 ++-------------------------------------
 6 files changed, 4 insertions(+), 193 deletions(-)

diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index 850995306718..5c7dea981e7a 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -495,8 +495,6 @@ void fib6_rt_update(struct net *net, struct fib6_info *rt,
 void inet6_rt_notify(int event, struct fib6_info *rt, struct nl_info *info,
 		     unsigned int flags);
 
-void fib6_run_gc(unsigned long expires, struct net *net, bool force);
-
 void fib6_gc_cleanup(void);
 
 int fib6_init(void);
diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h
index 3556595ce59a..d99c09ea1859 100644
--- a/include/net/ip6_route.h
+++ b/include/net/ip6_route.h
@@ -152,8 +152,6 @@ u32 rt6_multipath_hash(const struct net *net, const struct flowi6 *fl6,
 
 struct dst_entry *icmp6_dst_alloc(struct net_device *dev, struct flowi6 *fl6);
 
-void fib6_force_start_gc(struct net *net);
-
 struct fib6_info *addrconf_f6i_alloc(struct net *net, struct inet6_dev *idev,
 				     const struct in6_addr *addr, bool anycast,
 				     gfp_t gfp_flags);
diff --git a/include/net/netns/ipv6.h b/include/net/netns/ipv6.h
index 3cceb3e9320b..64e7d3944637 100644
--- a/include/net/netns/ipv6.h
+++ b/include/net/netns/ipv6.h
@@ -20,12 +20,8 @@ struct netns_sysctl_ipv6 {
 	struct ctl_table_header *frags_hdr;
 	struct ctl_table_header *xfrm6_hdr;
 #endif
-	int flush_delay;
 	int ip6_rt_max_size;
-	int ip6_rt_gc_min_interval;
-	int ip6_rt_gc_timeout;
 	int ip6_rt_gc_interval;
-	int ip6_rt_gc_elasticity;
 	int ip6_rt_mtu_expires;
 	int ip6_rt_min_advmss;
 	u32 multipath_hash_fields;
@@ -70,14 +66,12 @@ struct netns_ipv6 {
 	struct fib6_info	*fib6_null_entry;
 	struct rt6_info		*ip6_null_entry;
 	struct rt6_statistics   *rt6_stats;
-	struct timer_list       ip6_fib_timer;
 	struct hlist_head       *fib_table_hash;
 	struct fib6_table       *fib6_main_tbl;
 	struct list_head	fib6_walkers;
 	rwlock_t		fib6_walker_lock;
 	spinlock_t		fib6_gc_lock;
 	atomic_t		ip6_rt_gc_expire;
-	unsigned long		ip6_rt_last_gc;
 	unsigned char		flowlabel_has_excl;
 #ifdef CONFIG_IPV6_MULTIPLE_TABLES
 	bool			fib6_has_custom_rules;
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 8a10a0355816..72c4c4bf4289 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -1391,21 +1391,6 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct fib6_info *rt,
 	return 0;
 }
 
-static void fib6_start_gc(struct net *net, struct fib6_info *rt)
-{
-	if (!timer_pending(&net->ipv6.ip6_fib_timer) &&
-	    (rt->fib6_flags & RTF_EXPIRES))
-		mod_timer(&net->ipv6.ip6_fib_timer,
-			  jiffies + net->ipv6.sysctl.ip6_rt_gc_interval);
-}
-
-void fib6_force_start_gc(struct net *net)
-{
-	if (!timer_pending(&net->ipv6.ip6_fib_timer))
-		mod_timer(&net->ipv6.ip6_fib_timer,
-			  jiffies + net->ipv6.sysctl.ip6_rt_gc_interval);
-}
-
 static void __fib6_update_sernum_upto_root(struct fib6_info *rt,
 					   int sernum)
 {
@@ -1549,7 +1534,6 @@ int fib6_add(struct fib6_node *root, struct fib6_info *rt,
 		if (rt->nh)
 			list_add(&rt->nh_list, &rt->nh->f6i_list);
 		__fib6_update_sernum_upto_root(rt, fib6_new_sernum(info->nl_net));
-		fib6_start_gc(info->nl_net, rt);
 	}
 
 out:
@@ -2354,59 +2338,6 @@ static void fib6_flush_trees(struct net *net)
 	__fib6_clean_all(net, NULL, new_sernum, NULL, false);
 }
 
-/*
- *	Garbage collection
- */
-
-static int fib6_age(struct fib6_info *rt, void *arg)
-{
-	struct fib6_gc_args *gc_args = arg;
-	unsigned long now = jiffies;
-
-	/*	Also age clones in the exception table.
-	 *	Note, that clones are aged out
-	 *	only if they are not in use now.
-	 */
-	rt6_age_exceptions(rt, gc_args, now);
-
-	return 0;
-}
-
-void fib6_run_gc(unsigned long expires, struct net *net, bool force)
-{
-	struct fib6_gc_args gc_args;
-	unsigned long now;
-
-	if (force) {
-		spin_lock_bh(&net->ipv6.fib6_gc_lock);
-	} else if (!spin_trylock_bh(&net->ipv6.fib6_gc_lock)) {
-		mod_timer(&net->ipv6.ip6_fib_timer, jiffies + HZ);
-		return;
-	}
-	gc_args.timeout = expires ? (int)expires :
-			  net->ipv6.sysctl.ip6_rt_gc_interval;
-	gc_args.more = 0;
-
-	fib6_clean_all(net, fib6_age, &gc_args);
-	now = jiffies;
-	net->ipv6.ip6_rt_last_gc = now;
-
-	if (gc_args.more)
-		mod_timer(&net->ipv6.ip6_fib_timer,
-			  round_jiffies(now
-					+ net->ipv6.sysctl.ip6_rt_gc_interval));
-	else
-		del_timer(&net->ipv6.ip6_fib_timer);
-	spin_unlock_bh(&net->ipv6.fib6_gc_lock);
-}
-
-static void fib6_gc_timer_cb(struct timer_list *t)
-{
-	struct net *arg = from_timer(arg, t, ipv6.ip6_fib_timer);
-
-	fib6_run_gc(0, arg, true);
-}
-
 static int __net_init fib6_net_init(struct net *net)
 {
 	size_t size = sizeof(struct hlist_head) * FIB6_TABLE_HASHSZ;
@@ -2423,7 +2354,6 @@ static int __net_init fib6_net_init(struct net *net)
 	spin_lock_init(&net->ipv6.fib6_gc_lock);
 	rwlock_init(&net->ipv6.fib6_walker_lock);
 	INIT_LIST_HEAD(&net->ipv6.fib6_walkers);
-	timer_setup(&net->ipv6.ip6_fib_timer, fib6_gc_timer_cb, 0);
 
 	net->ipv6.rt6_stats = kzalloc(sizeof(*net->ipv6.rt6_stats), GFP_KERNEL);
 	if (!net->ipv6.rt6_stats)
@@ -2481,8 +2411,6 @@ static void fib6_net_exit(struct net *net)
 {
 	unsigned int i;
 
-	del_timer_sync(&net->ipv6.ip6_fib_timer);
-
 	for (i = 0; i < FIB6_TABLE_HASHSZ; i++) {
 		struct hlist_head *head = &net->ipv6.fib_table_hash[i];
 		struct hlist_node *tmp;
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 1d4cf7f73097..714e4fd8c13d 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1869,7 +1869,6 @@ static int ndisc_netdev_event(struct notifier_block *this, unsigned long event,
 	switch (event) {
 	case NETDEV_CHANGEADDR:
 		neigh_changeaddr(&nd_tbl, dev);
-		fib6_run_gc(0, net, false);
 		fallthrough;
 	case NETDEV_UP:
 		idev = in6_dev_get(dev);
@@ -1898,7 +1897,6 @@ static int ndisc_netdev_event(struct notifier_block *this, unsigned long event,
 		break;
 	case NETDEV_DOWN:
 		neigh_ifdown(&nd_tbl, dev);
-		fib6_run_gc(0, net, false);
 		break;
 	case NETDEV_NOTIFY_PEERS:
 		ndisc_send_unsol_na(dev);
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 87721a2a91b6..1d2ebbfc5090 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -249,8 +249,7 @@ static void ip6_confirm_neigh(const struct dst_entry *dst, const void *daddr)
 
 static struct dst_ops ip6_dst_ops_template = {
 	.family			=	AF_INET6,
-	.gc			=	ip6_dst_gc,
-	.gc_thresh		=	1024,
+	.gc_thresh		=	~0,
 	.check			=	ip6_dst_check,
 	.default_advmss		=	ip6_default_advmss,
 	.mtu			=	ip6_mtu,
@@ -1724,7 +1723,6 @@ static int rt6_insert_exception(struct rt6_info *nrt,
 		spin_lock_bh(&f6i->fib6_table->tb6_lock);
 		fib6_update_sernum(net, f6i);
 		spin_unlock_bh(&f6i->fib6_table->tb6_lock);
-		fib6_force_start_gc(net);
 	}
 
 	return err;
@@ -3283,28 +3281,6 @@ struct dst_entry *icmp6_dst_alloc(struct net_device *dev,
 	return dst;
 }
 
-static void ip6_dst_gc(struct dst_ops *ops)
-{
-	struct net *net = container_of(ops, struct net, ipv6.ip6_dst_ops);
-	int rt_min_interval = net->ipv6.sysctl.ip6_rt_gc_min_interval;
-	int rt_elasticity = net->ipv6.sysctl.ip6_rt_gc_elasticity;
-	int rt_gc_timeout = net->ipv6.sysctl.ip6_rt_gc_timeout;
-	unsigned long rt_last_gc = net->ipv6.ip6_rt_last_gc;
-	unsigned int val;
-	int entries;
-
-	if (time_after(rt_last_gc + rt_min_interval, jiffies))
-		goto out;
-
-	fib6_run_gc(atomic_inc_return(&net->ipv6.ip6_rt_gc_expire), net, true);
-	entries = dst_entries_get_slow(ops);
-	if (entries < ops->gc_thresh)
-		atomic_set(&net->ipv6.ip6_rt_gc_expire, rt_gc_timeout >> 1);
-out:
-	val = atomic_read(&net->ipv6.ip6_rt_gc_expire);
-	atomic_set(&net->ipv6.ip6_rt_gc_expire, val - (val >> rt_elasticity));
-}
-
 static int ip6_nh_lookup_table(struct net *net, struct fib6_config *cfg,
 			       const struct in6_addr *gw_addr, u32 tbid,
 			       int flags, struct fib6_result *res)
@@ -6319,25 +6295,6 @@ static int rt6_stats_seq_show(struct seq_file *seq, void *v)
 
 #ifdef CONFIG_SYSCTL
 
-static int ipv6_sysctl_rtcache_flush(struct ctl_table *ctl, int write,
-			      void *buffer, size_t *lenp, loff_t *ppos)
-{
-	struct net *net;
-	int delay;
-	int ret;
-	if (!write)
-		return -EINVAL;
-
-	net = (struct net *)ctl->extra1;
-	delay = net->ipv6.sysctl.flush_delay;
-	ret = proc_dointvec(ctl, write, buffer, lenp, ppos);
-	if (ret)
-		return ret;
-
-	fib6_run_gc(delay <= 0 ? 0 : (unsigned long)delay, net, delay > 0);
-	return 0;
-}
-
 static struct ctl_table ipv6_route_table_template[] = {
 	{
 		.procname	=	"max_size",
@@ -6346,48 +6303,6 @@ static struct ctl_table ipv6_route_table_template[] = {
 		.mode		=	0644,
 		.proc_handler	=	proc_dointvec,
 	},
-	{
-		.procname	=	"gc_thresh",
-		.data		=	&ip6_dst_ops_template.gc_thresh,
-		.maxlen		=	sizeof(int),
-		.mode		=	0644,
-		.proc_handler	=	proc_dointvec,
-	},
-	{
-		.procname	=	"flush",
-		.data		=	&init_net.ipv6.sysctl.flush_delay,
-		.maxlen		=	sizeof(int),
-		.mode		=	0200,
-		.proc_handler	=	ipv6_sysctl_rtcache_flush
-	},
-	{
-		.procname	=	"gc_min_interval",
-		.data		=	&init_net.ipv6.sysctl.ip6_rt_gc_min_interval,
-		.maxlen		=	sizeof(int),
-		.mode		=	0644,
-		.proc_handler	=	proc_dointvec_jiffies,
-	},
-	{
-		.procname	=	"gc_timeout",
-		.data		=	&init_net.ipv6.sysctl.ip6_rt_gc_timeout,
-		.maxlen		=	sizeof(int),
-		.mode		=	0644,
-		.proc_handler	=	proc_dointvec_jiffies,
-	},
-	{
-		.procname	=	"gc_interval",
-		.data		=	&init_net.ipv6.sysctl.ip6_rt_gc_interval,
-		.maxlen		=	sizeof(int),
-		.mode		=	0644,
-		.proc_handler	=	proc_dointvec_jiffies,
-	},
-	{
-		.procname	=	"gc_elasticity",
-		.data		=	&init_net.ipv6.sysctl.ip6_rt_gc_elasticity,
-		.maxlen		=	sizeof(int),
-		.mode		=	0644,
-		.proc_handler	=	proc_dointvec,
-	},
 	{
 		.procname	=	"mtu_expires",
 		.data		=	&init_net.ipv6.sysctl.ip6_rt_mtu_expires,
@@ -6402,13 +6317,6 @@ static struct ctl_table ipv6_route_table_template[] = {
 		.mode		=	0644,
 		.proc_handler	=	proc_dointvec,
 	},
-	{
-		.procname	=	"gc_min_interval_ms",
-		.data		=	&init_net.ipv6.sysctl.ip6_rt_gc_min_interval,
-		.maxlen		=	sizeof(int),
-		.mode		=	0644,
-		.proc_handler	=	proc_dointvec_ms_jiffies,
-	},
 	{
 		.procname	=	"skip_notify_on_dev_down",
 		.data		=	&init_net.ipv6.sysctl.skip_notify_on_dev_down,
@@ -6431,17 +6339,9 @@ struct ctl_table * __net_init ipv6_route_sysctl_init(struct net *net)
 
 	if (table) {
 		table[0].data = &net->ipv6.sysctl.ip6_rt_max_size;
-		table[1].data = &net->ipv6.ip6_dst_ops.gc_thresh;
-		table[2].data = &net->ipv6.sysctl.flush_delay;
-		table[2].extra1 = net;
-		table[3].data = &net->ipv6.sysctl.ip6_rt_gc_min_interval;
-		table[4].data = &net->ipv6.sysctl.ip6_rt_gc_timeout;
-		table[5].data = &net->ipv6.sysctl.ip6_rt_gc_interval;
-		table[6].data = &net->ipv6.sysctl.ip6_rt_gc_elasticity;
-		table[7].data = &net->ipv6.sysctl.ip6_rt_mtu_expires;
-		table[8].data = &net->ipv6.sysctl.ip6_rt_min_advmss;
-		table[9].data = &net->ipv6.sysctl.ip6_rt_gc_min_interval;
-		table[10].data = &net->ipv6.sysctl.skip_notify_on_dev_down;
+		table[1].data = &net->ipv6.sysctl.ip6_rt_mtu_expires;
+		table[2].data = &net->ipv6.sysctl.ip6_rt_min_advmss;
+		table[3].data = &net->ipv6.sysctl.skip_notify_on_dev_down;
 
 		/* Don't export sysctls to unprivileged users */
 		if (net->user_ns != &init_user_ns)
@@ -6504,12 +6404,7 @@ static int __net_init ip6_route_net_init(struct net *net)
 #endif
 #endif
 
-	net->ipv6.sysctl.flush_delay = 0;
 	net->ipv6.sysctl.ip6_rt_max_size = INT_MAX;
-	net->ipv6.sysctl.ip6_rt_gc_min_interval = HZ / 2;
-	net->ipv6.sysctl.ip6_rt_gc_timeout = 60*HZ;
-	net->ipv6.sysctl.ip6_rt_gc_interval = 30*HZ;
-	net->ipv6.sysctl.ip6_rt_gc_elasticity = 9;
 	net->ipv6.sysctl.ip6_rt_mtu_expires = 10*60*HZ;
 	net->ipv6.sysctl.ip6_rt_min_advmss = IPV6_MIN_MTU - 20 - 40;
 	net->ipv6.sysctl.skip_notify_on_dev_down = 0;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [RFC PATCH net-next 2/2] net: Remove unused code and variables.
  2023-05-17  4:27 ` [RFC PATCH net-next 2/2] net: Remove unused code and variables Kui-Feng Lee
@ 2023-05-17 14:56   ` Simon Horman
       [not found]     ` <CAFVMQ6SX=CES2EpovCbB5eSZjZuSbxAmoACbNdLCGWdU8kf=PQ@mail.gmail.com>
  0 siblings, 1 reply; 6+ messages in thread
From: Simon Horman @ 2023-05-17 14:56 UTC (permalink / raw)
  To: Kui-Feng Lee; +Cc: netdev, ast, martin.lau, kernel-team, Kui-Feng Lee

On Tue, May 16, 2023 at 09:27:57PM -0700, Kui-Feng Lee wrote:
> Since GC has been removed, some functions and variables are useless.  That
> includes some sysctl variables that control GC.
> 
> Signed-off-by: Kui-Feng Lee <kuifeng@meta.com>

Hi Kui-Feng,

thanks for your patch.
Some initial review from my side.

> -static void fib6_gc_timer_cb(struct timer_list *t)
> -{
> -	struct net *arg = from_timer(arg, t, ipv6.ip6_fib_timer);
> -
> -	fib6_run_gc(0, arg, true);
> -}
> -

There is a forward declaration of fib6_gc_timer_cb around line 80.
It should be removed too.

...

> @@ -3283,28 +3281,6 @@ struct dst_entry *icmp6_dst_alloc(struct net_device *dev,
>  	return dst;
>  }
>  
> -static void ip6_dst_gc(struct dst_ops *ops)

There is a forward declaration of ip6_dst_gc() around line 94.
That should be deleted too.

> -{
> -	struct net *net = container_of(ops, struct net, ipv6.ip6_dst_ops);
> -	int rt_min_interval = net->ipv6.sysctl.ip6_rt_gc_min_interval;
> -	int rt_elasticity = net->ipv6.sysctl.ip6_rt_gc_elasticity;
> -	int rt_gc_timeout = net->ipv6.sysctl.ip6_rt_gc_timeout;
> -	unsigned long rt_last_gc = net->ipv6.ip6_rt_last_gc;
> -	unsigned int val;
> -	int entries;
> -
> -	if (time_after(rt_last_gc + rt_min_interval, jiffies))
> -		goto out;
> -
> -	fib6_run_gc(atomic_inc_return(&net->ipv6.ip6_rt_gc_expire), net, true);
> -	entries = dst_entries_get_slow(ops);
> -	if (entries < ops->gc_thresh)
> -		atomic_set(&net->ipv6.ip6_rt_gc_expire, rt_gc_timeout >> 1);
> -out:
> -	val = atomic_read(&net->ipv6.ip6_rt_gc_expire);
> -	atomic_set(&net->ipv6.ip6_rt_gc_expire, val - (val >> rt_elasticity));
> -}
> -

...

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC PATCH net-next 1/2] net: Remove expired routes with separated timers.
  2023-05-17  4:27 ` [RFC PATCH net-next 1/2] net: Remove expired routes with separated timers Kui-Feng Lee
@ 2023-05-17 15:26   ` Jakub Kicinski
  0 siblings, 0 replies; 6+ messages in thread
From: Jakub Kicinski @ 2023-05-17 15:26 UTC (permalink / raw)
  To: Kui-Feng Lee; +Cc: netdev, ast, martin.lau, kernel-team, Kui-Feng Lee

On Tue, 16 May 2023 21:27:56 -0700 Kui-Feng Lee wrote:
> To: netdev@vger.kernel.org,  ast@kernel.org,  martin.lau@linux.dev,  kernel-team@meta.com
> Cc: Kui-Feng Lee <kuifeng@meta.com>

How do you expect the IPv6 maintainer to find this series? :(
Please use get_maintainer.pl to CC the correct maintainers.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Fwd: [RFC PATCH net-next 2/2] net: Remove unused code and variables.
       [not found]     ` <CAFVMQ6SX=CES2EpovCbB5eSZjZuSbxAmoACbNdLCGWdU8kf=PQ@mail.gmail.com>
@ 2023-05-17 17:34       ` Kui-Feng Lee
  0 siblings, 0 replies; 6+ messages in thread
From: Kui-Feng Lee @ 2023-05-17 17:34 UTC (permalink / raw)
  To: Thinker Li
  Cc: netdev, ast, martin.lau, kernel-team, Kui-Feng Lee, davem,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Maciej Żenczykowski, Mahesh Bandewar

Thank you for comments.

On 5/17/23 10:27, Thinker Li wrote:
> 
> 
> 
> On Tue, May 16, 2023 at 09:27:57PM -0700, Kui-Feng Lee wrote:
>  > Since GC has been removed, some functions and variables are useless.  
> That
>  > includes some sysctl variables that control GC.
>  >
>  > Signed-off-by: Kui-Feng Lee <kuifeng@meta.com <mailto:kuifeng@meta.com>>
> 
> Hi Kui-Feng,
> 
> thanks for your patch.
> Some initial review from my side.
> 
>  > -static void fib6_gc_timer_cb(struct timer_list *t)
>  > -{
>  > -     struct net *arg = from_timer(arg, t, ipv6.ip6_fib_timer);
>  > -
>  > -     fib6_run_gc(0, arg, true);
>  > -}
>  > -
> 
> There is a forward declaration of fib6_gc_timer_cb around line 80.
> It should be removed too.

Got it!

> 
> ...
> 
>  > @@ -3283,28 +3281,6 @@ struct dst_entry *icmp6_dst_alloc(struct 
> net_device *dev,
>  >       return dst;
>  >  }
>  >
>  > -static void ip6_dst_gc(struct dst_ops *ops)
> 
> There is a forward declaration of ip6_dst_gc() around line 94.
> That should be deleted too.
> 
Got it!

>  > -{
>  > -     struct net *net = container_of(ops, struct net, ipv6.ip6_dst_ops);
>  > -     int rt_min_interval = net->ipv6.sysctl.ip6_rt_gc_min_interval;
>  > -     int rt_elasticity = net->ipv6.sysctl.ip6_rt_gc_elasticity;
>  > -     int rt_gc_timeout = net->ipv6.sysctl.ip6_rt_gc_timeout;
>  > -     unsigned long rt_last_gc = net->ipv6.ip6_rt_last_gc;
>  > -     unsigned int val;
>  > -     int entries;
>  > -
>  > -     if (time_after(rt_last_gc + rt_min_interval, jiffies))
>  > -             goto out;
>  > -
>  > -     fib6_run_gc(atomic_inc_return(&net->ipv6.ip6_rt_gc_expire), 
> net, true);
>  > -     entries = dst_entries_get_slow(ops);
>  > -     if (entries < ops->gc_thresh)
>  > -             atomic_set(&net->ipv6.ip6_rt_gc_expire, rt_gc_timeout 
>  >> 1);
>  > -out:
>  > -     val = atomic_read(&net->ipv6.ip6_rt_gc_expire);
>  > -     atomic_set(&net->ipv6.ip6_rt_gc_expire, val - (val >> 
> rt_elasticity));
>  > -}
>  > -
> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-05-17 17:34 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-05-17  4:27 [RFC PATCH net-next 0/2] Mitigate the Issue of Expired Routes in Linux IPv6 Routing Tables Kui-Feng Lee
2023-05-17  4:27 ` [RFC PATCH net-next 1/2] net: Remove expired routes with separated timers Kui-Feng Lee
2023-05-17 15:26   ` Jakub Kicinski
2023-05-17  4:27 ` [RFC PATCH net-next 2/2] net: Remove unused code and variables Kui-Feng Lee
2023-05-17 14:56   ` Simon Horman
     [not found]     ` <CAFVMQ6SX=CES2EpovCbB5eSZjZuSbxAmoACbNdLCGWdU8kf=PQ@mail.gmail.com>
2023-05-17 17:34       ` Fwd: " Kui-Feng Lee

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).