netdev.vger.kernel.org archive mirror
* [PATCH net-next v2 0/2] net: Speedup some nexthop handling when having A LOT of nexthops
@ 2025-08-16 23:12 Christoph Paasch via B4 Relay
  2025-08-16 23:12 ` [PATCH net-next v2 1/2] net: Make nexthop-dumps scale linearly with the number " Christoph Paasch via B4 Relay
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Christoph Paasch via B4 Relay @ 2025-08-16 23:12 UTC
  To: David Ahern, Nikolay Aleksandrov, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, Ido Schimmel
  Cc: netdev, Christoph Paasch

Configuring a very large number of nexthops is feasible within a
reasonable time frame, but certain netlink commands can become
extremely slow.

This series addresses some of these, namely dumping and removing
nexthops.

Signed-off-by: Christoph Paasch <cpaasch@openai.com>
---
Changes in v2:
- Added another improvement to the series "net: When removing nexthops,
  don't call synchronize_net if it is not necessary"
- Fixed typos, made comments within 80-character limit and unified
  comment-style. (Ido Schimmel)
- Removed if (nh->id < s_idx) in the for-loop as it is no longer needed.
  (Ido Schimmel)
- Link to v1: https://lore.kernel.org/r/20250724-nexthop_dump-v1-1-6b43fffd5bac@openai.com

---
Christoph Paasch (2):
      net: Make nexthop-dumps scale linearly with the number of nexthops
      net: When removing nexthops, don't call synchronize_net if it is not necessary

 net/ipv4/nexthop.c | 42 +++++++++++++++++++++++++++++++++++++++---
 1 file changed, 39 insertions(+), 3 deletions(-)
---
base-commit: bab3ce404553de56242d7b09ad7ea5b70441ea41
change-id: 20250724-nexthop_dump-f6c32472bcdf

Best regards,
-- 
Christoph Paasch <cpaasch@openai.com>




* [PATCH net-next v2 1/2] net: Make nexthop-dumps scale linearly with the number of nexthops
  2025-08-16 23:12 [PATCH net-next v2 0/2] net: Speedup some nexthop handling when having A LOT of nexthops Christoph Paasch via B4 Relay
@ 2025-08-16 23:12 ` Christoph Paasch via B4 Relay
  2025-08-17  8:42   ` Ido Schimmel
                     ` (2 more replies)
  2025-08-16 23:12 ` [PATCH net-next v2 2/2] net: When removing nexthops, don't call synchronize_net if it is not necessary Christoph Paasch via B4 Relay
                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 11+ messages in thread
From: Christoph Paasch via B4 Relay @ 2025-08-16 23:12 UTC
  To: David Ahern, Nikolay Aleksandrov, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, Ido Schimmel
  Cc: netdev, Christoph Paasch

From: Christoph Paasch <cpaasch@openai.com>

When we have a (very) large number of nexthops, they do not fit within a
single message. rtm_dump_walk_nexthops() is thus called repeatedly,
and ctx->idx is used to avoid dumping the same nexthops again.

Currently, we avoid dumping the same nexthops by walking the entire
nexthop rb-tree from the left-most node until we find a node whose id
is >= s_idx. That does not scale well, because every call steps over
all the nodes that earlier calls already dumped.
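
A rough cost model for the existing walk, under stated assumptions: n is
the total number of nexthops and each dump message carries about k of
them (both symbols are introduced here for illustration and do not appear
in the code). Call number j then has to step over the j*k nodes that
were already dumped, so the number of skipped nodes over a whole dump is
approximately:

```latex
\sum_{j=0}^{n/k-1} j\,k
  \;=\; \frac{k}{2}\cdot\frac{n}{k}\left(\frac{n}{k}-1\right)
  \;\approx\; \frac{n^{2}}{2k}
```

For a fixed k this grows quadratically in n, which is why doubling the
number of nexthops can be expected to far more than double the dump time.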

Instead, descend the tree directly to the first nexthop that should be
dumped (the left-most one whose id is >= s_idx). This finds the
starting node in O(log(n)).
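
For illustration only, the lookup described above can be sketched in
plain userspace C. This is not the kernel code: struct node,
lower_bound() and build() are hypothetical names, and a simple
array-backed binary search tree stands in for the kernel's rb-tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct nexthop and its embedded rb_node. */
struct node {
	uint32_t id;
	struct node *left, *right;
};

/* Return the left-most node whose id is >= s_idx, or NULL if none. */
static struct node *lower_bound(struct node *root, uint32_t s_idx)
{
	struct node *candidate = NULL;

	while (root) {
		if (root->id < s_idx) {
			/* This node and its left subtree are too small. */
			root = root->right;
		} else {
			/* Remember this match, then look for a smaller one. */
			candidate = root;
			root = root->left;
		}
	}
	return candidate;
}

/* Build a balanced tree over the sorted range ids[lo..hi) from a pool. */
static struct node *build(struct node *pool, size_t *used,
			  const uint32_t *ids, size_t lo, size_t hi)
{
	struct node *n;
	size_t mid;

	assert(pool && used && ids);
	if (lo >= hi)
		return NULL;
	mid = lo + (hi - lo) / 2;
	n = &pool[(*used)++];
	n->id = ids[mid];
	n->left = build(pool, used, ids, lo, mid);
	n->right = build(pool, used, ids, mid + 1, hi);
	return n;
}
```

Each loop iteration moves one level down the tree, so the number of
steps is bounded by the tree height. In a balanced tree that is
O(log(n)), independent of how many nodes have an id below s_idx.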

We have quite a nice improvement with this:

Before:
=======

--> ~1M nexthops:
$ time ~/libnl/src/nl-nh-list | wc -l
1050624

real	0m21.080s
user	0m0.666s
sys	0m20.384s

--> ~2M nexthops:
$ time ~/libnl/src/nl-nh-list | wc -l
2101248

real	1m51.649s
user	0m1.540s
sys	1m49.908s

After:
======

--> ~1M nexthops:
$ time ~/libnl/src/nl-nh-list | wc -l
1050624

real	0m1.157s
user	0m0.926s
sys	0m0.259s

--> ~2M nexthops:
$ time ~/libnl/src/nl-nh-list | wc -l
2101248

real	0m2.763s
user	0m2.042s
sys	0m0.776s

Signed-off-by: Christoph Paasch <cpaasch@openai.com>
---
 net/ipv4/nexthop.c | 36 +++++++++++++++++++++++++++++++++---
 1 file changed, 33 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index 29118c43ebf5f1e91292fe227d4afde313e564bb..509004bfd08ec43de44c7ce4a540c983d0e70201 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -3511,12 +3511,42 @@ static int rtm_dump_walk_nexthops(struct sk_buff *skb,
 	int err;
 
 	s_idx = ctx->idx;
-	for (node = rb_first(root); node; node = rb_next(node)) {
+
+	/* If this is not the first invocation, ctx->idx will contain the id of
+	 * the last nexthop we processed. Instead of starting from the very
+	 * first element of the red/black tree again and linearly skipping the
+	 * (potentially large) set of nodes with an id smaller than s_idx, walk
+	 * the tree and find the left-most node whose id is >= s_idx.  This
+	 * provides an efficient O(log n) starting point for the dump
+	 * continuation.
+	 */
+	if (s_idx != 0) {
+		struct rb_node *tmp = root->rb_node;
+
+		node = NULL;
+		while (tmp) {
+			struct nexthop *nh;
+
+			nh = rb_entry(tmp, struct nexthop, rb_node);
+			if (nh->id < s_idx) {
+				tmp = tmp->rb_right;
+			} else {
+				/* Track current candidate and keep looking on
+				 * the left side to find the left-most
+				 * (smallest id) that is still >= s_idx.
+				 */
+				node = tmp;
+				tmp = tmp->rb_left;
+			}
+		}
+	} else {
+		node = rb_first(root);
+	}
+
+	for (; node; node = rb_next(node)) {
 		struct nexthop *nh;
 
 		nh = rb_entry(node, struct nexthop, rb_node);
-		if (nh->id < s_idx)
-			continue;
 
 		ctx->idx = nh->id;
 		err = nh_cb(skb, cb, nh, data);

-- 
2.50.1




* [PATCH net-next v2 2/2] net: When removing nexthops, don't call synchronize_net if it is not necessary
  2025-08-16 23:12 [PATCH net-next v2 0/2] net: Speedup some nexthop handling when having A LOT of nexthops Christoph Paasch via B4 Relay
  2025-08-16 23:12 ` [PATCH net-next v2 1/2] net: Make nexthop-dumps scale linearly with the number " Christoph Paasch via B4 Relay
@ 2025-08-16 23:12 ` Christoph Paasch via B4 Relay
  2025-08-17  8:44   ` Ido Schimmel
                     ` (2 more replies)
  2025-08-18 13:33 ` [PATCH net-next v2 0/2] net: Speedup some nexthop handling when having A LOT of nexthops David Ahern
  2025-08-20  1:00 ` patchwork-bot+netdevbpf
  3 siblings, 3 replies; 11+ messages in thread
From: Christoph Paasch via B4 Relay @ 2025-08-16 23:12 UTC
  To: David Ahern, Nikolay Aleksandrov, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, Ido Schimmel
  Cc: netdev, Christoph Paasch

From: Christoph Paasch <cpaasch@openai.com>

When removing a nexthop, commit
90f33bffa382 ("nexthops: don't modify published nexthop groups") added a
call to synchronize_rcu() (later changed to _net()) to make sure
everyone sees the new nexthop-group before the rtnl-lock is released.

When one wants to delete a large number of groups and nexthops, it is
fastest to first flush the groups (ip nexthop flush groups) and then
flush the nexthops themselves (ip -6 nexthop flush), because that way
the groups don't need to be rebalanced.

However, `ip -6 nexthop flush` will still take a long time if there is
a very large number of nexthops because of the call to
synchronize_net(). If a nexthop is not a member of any group, no group
is modified and there is no point in calling synchronize_net(). So,
skip it entirely when nh->grp_list is empty.
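
The idea can be sketched in userspace C as follows. This is only an
illustration under stated assumptions: struct grp_entry, struct nh,
fake_synchronize() and remove_from_groups() are hypothetical stand-ins,
and a counter replaces the real, expensive synchronize_net() call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the kernel's types. */
struct grp_entry {
	struct grp_entry *next;
};

struct nh {
	struct grp_entry *grp_list;	/* NULL: member of no group */
};

/* Counts how often the simulated grace-period wait was requested. */
static unsigned int sync_calls;

static void fake_synchronize(void)
{
	sync_calls++;
}

static void remove_from_groups(struct nh *nh)
{
	assert(nh != NULL);

	/* No group was touched, so there is nothing readers must observe. */
	if (!nh->grp_list)
		return;

	/* Unlink the nexthop from every group it belongs to. */
	while (nh->grp_list)
		nh->grp_list = nh->grp_list->next;

	/* Only reached when at least one group was actually changed. */
	fake_synchronize();
}
```

With the early return, removing a nexthop that belongs to no group never
pays for the wait, while a nexthop that is still a group member is
handled exactly as before.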

This gives us a nice speedup:

BEFORE:
=======

$ time sudo ip -6 nexthop flush
Dump was interrupted and may be inconsistent.
Flushed 2097152 nexthops

real	1m45.345s
user	0m0.001s
sys	0m0.005s

$ time sudo ip -6 nexthop flush
Dump was interrupted and may be inconsistent.
Flushed 4194304 nexthops

real	3m10.430s
user	0m0.002s
sys	0m0.004s

AFTER:
======

$ time sudo ip -6 nexthop flush
Dump was interrupted and may be inconsistent.
Flushed 2097152 nexthops

real	0m17.545s
user	0m0.003s
sys	0m0.003s

$ time sudo ip -6 nexthop flush
Dump was interrupted and may be inconsistent.
Flushed 4194304 nexthops

real	0m35.823s
user	0m0.002s
sys	0m0.004s

Signed-off-by: Christoph Paasch <cpaasch@openai.com>
---
 net/ipv4/nexthop.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index 509004bfd08ec43de44c7ce4a540c983d0e70201..0a20625f5ffb471052d92b48802076b8295dd703 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -2087,6 +2087,12 @@ static void remove_nexthop_from_groups(struct net *net, struct nexthop *nh,
 {
 	struct nh_grp_entry *nhge, *tmp;
 
+	/* If there is nothing to do, let's avoid the costly call to
+	 * synchronize_net()
+	 */
+	if (list_empty(&nh->grp_list))
+		return;
+
 	list_for_each_entry_safe(nhge, tmp, &nh->grp_list, nh_list)
 		remove_nh_grp_entry(net, nhge, nlinfo);
 

-- 
2.50.1




* Re: [PATCH net-next v2 1/2] net: Make nexthop-dumps scale linearly with the number of nexthops
  2025-08-16 23:12 ` [PATCH net-next v2 1/2] net: Make nexthop-dumps scale linearly with the number " Christoph Paasch via B4 Relay
@ 2025-08-17  8:42   ` Ido Schimmel
  2025-08-17  9:40   ` Nikolay Aleksandrov
  2025-08-18  9:54   ` Eric Dumazet
  2 siblings, 0 replies; 11+ messages in thread
From: Ido Schimmel @ 2025-08-17  8:42 UTC
  To: cpaasch
  Cc: David Ahern, Nikolay Aleksandrov, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, netdev

On Sat, Aug 16, 2025 at 04:12:48PM -0700, Christoph Paasch via B4 Relay wrote:
> From: Christoph Paasch <cpaasch@openai.com>
> 
> When we have a (very) large number of nexthops, they do not fit within a
> single message. rtm_dump_walk_nexthops() thus will be called repeatedly
> and ctx->idx is used to avoid dumping the same nexthops again.
> 
> The approach in which we avoid dumping the same nexthops is by basically
> walking the entire nexthop rb-tree from the left-most node until we find
> a node whose id is >= s_idx. That does not scale well.
> 
> Instead of this inefficient approach, rather go directly through the
> tree to the nexthop that should be dumped (the one whose nh_id >=
> s_idx). This allows us to find the relevant node in O(log(n)).

[...]

> Signed-off-by: Christoph Paasch <cpaasch@openai.com>

Reviewed-by: Ido Schimmel <idosch@nvidia.com>


* Re: [PATCH net-next v2 2/2] net: When removing nexthops, don't call synchronize_net if it is not necessary
  2025-08-16 23:12 ` [PATCH net-next v2 2/2] net: When removing nexthops, don't call synchronize_net if it is not necessary Christoph Paasch via B4 Relay
@ 2025-08-17  8:44   ` Ido Schimmel
  2025-08-17  9:40   ` Nikolay Aleksandrov
  2025-08-18 10:02   ` Eric Dumazet
  2 siblings, 0 replies; 11+ messages in thread
From: Ido Schimmel @ 2025-08-17  8:44 UTC
  To: cpaasch
  Cc: David Ahern, Nikolay Aleksandrov, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, netdev

On Sat, Aug 16, 2025 at 04:12:49PM -0700, Christoph Paasch via B4 Relay wrote:
> From: Christoph Paasch <cpaasch@openai.com>
> 
> When removing a nexthop, commit
> 90f33bffa382 ("nexthops: don't modify published nexthop groups") added a
> call to synchronize_rcu() (later changed to _net()) to make sure
> everyone sees the new nexthop-group before the rtnl-lock is released.
> 
> When one wants to delete a large number of groups and nexthops, it is
> fastest to first flush the groups (ip nexthop flush groups) and then
> flush the nexthops themselves (ip -6 nexthop flush). As that way the
> groups don't need to be rebalanced.
> 
> However, `ip -6 nexthop flush` will still take a long time if there is
> a very large number of nexthops because of the call to
> synchronize_net(). Now, if there are no more groups, there is no point
> in calling synchronize_net(). So, let's skip that entirely by checking
> if nh->grp_list is empty.

[...]

> Signed-off-by: Christoph Paasch <cpaasch@openai.com>

Reviewed-by: Ido Schimmel <idosch@nvidia.com>


* Re: [PATCH net-next v2 1/2] net: Make nexthop-dumps scale linearly with the number of nexthops
  2025-08-16 23:12 ` [PATCH net-next v2 1/2] net: Make nexthop-dumps scale linearly with the number " Christoph Paasch via B4 Relay
  2025-08-17  8:42   ` Ido Schimmel
@ 2025-08-17  9:40   ` Nikolay Aleksandrov
  2025-08-18  9:54   ` Eric Dumazet
  2 siblings, 0 replies; 11+ messages in thread
From: Nikolay Aleksandrov @ 2025-08-17  9:40 UTC
  To: cpaasch, David Ahern, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, Ido Schimmel
  Cc: netdev

On 8/17/25 02:12, Christoph Paasch via B4 Relay wrote:
> From: Christoph Paasch <cpaasch@openai.com>
> 
> When we have a (very) large number of nexthops, they do not fit within a
> single message. rtm_dump_walk_nexthops() thus will be called repeatedly
> and ctx->idx is used to avoid dumping the same nexthops again.
> 
> The approach in which we avoid dumping the same nexthops is by basically
> walking the entire nexthop rb-tree from the left-most node until we find
> a node whose id is >= s_idx. That does not scale well.
> 
> Instead of this inefficient approach, rather go directly through the
> tree to the nexthop that should be dumped (the one whose nh_id >=
> s_idx). This allows us to find the relevant node in O(log(n)).
> 
> We have quite a nice improvement with this:
> 
> Before:
> =======
> 
> --> ~1M nexthops:
> $ time ~/libnl/src/nl-nh-list | wc -l
> 1050624
> 
> real	0m21.080s
> user	0m0.666s
> sys	0m20.384s
> 
> --> ~2M nexthops:
> $ time ~/libnl/src/nl-nh-list | wc -l
> 2101248
> 
> real	1m51.649s
> user	0m1.540s
> sys	1m49.908s
> 
> After:
> ======
> 
> --> ~1M nexthops:
> $ time ~/libnl/src/nl-nh-list | wc -l
> 1050624
> 
> real	0m1.157s
> user	0m0.926s
> sys	0m0.259s
> 
> --> ~2M nexthops:
> $ time ~/libnl/src/nl-nh-list | wc -l
> 2101248
> 
> real	0m2.763s
> user	0m2.042s
> sys	0m0.776s
> 
> Signed-off-by: Christoph Paasch <cpaasch@openai.com>
> ---
>  net/ipv4/nexthop.c | 36 +++++++++++++++++++++++++++++++++---
>  1 file changed, 33 insertions(+), 3 deletions(-)
> 

Very nice,
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>



* Re: [PATCH net-next v2 2/2] net: When removing nexthops, don't call synchronize_net if it is not necessary
  2025-08-16 23:12 ` [PATCH net-next v2 2/2] net: When removing nexthops, don't call synchronize_net if it is not necessary Christoph Paasch via B4 Relay
  2025-08-17  8:44   ` Ido Schimmel
@ 2025-08-17  9:40   ` Nikolay Aleksandrov
  2025-08-18 10:02   ` Eric Dumazet
  2 siblings, 0 replies; 11+ messages in thread
From: Nikolay Aleksandrov @ 2025-08-17  9:40 UTC
  To: cpaasch, David Ahern, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, Ido Schimmel
  Cc: netdev

On 8/17/25 02:12, Christoph Paasch via B4 Relay wrote:
> From: Christoph Paasch <cpaasch@openai.com>
> 
> When removing a nexthop, commit
> 90f33bffa382 ("nexthops: don't modify published nexthop groups") added a
> call to synchronize_rcu() (later changed to _net()) to make sure
> everyone sees the new nexthop-group before the rtnl-lock is released.
> 
> When one wants to delete a large number of groups and nexthops, it is
> fastest to first flush the groups (ip nexthop flush groups) and then
> flush the nexthops themselves (ip -6 nexthop flush). As that way the
> groups don't need to be rebalanced.
> 
> However, `ip -6 nexthop flush` will still take a long time if there is
> a very large number of nexthops because of the call to
> synchronize_net(). Now, if there are no more groups, there is no point
> in calling synchronize_net(). So, let's skip that entirely by checking
> if nh->grp_list is empty.
> 
> This gives us a nice speedup:
> 
> BEFORE:
> =======
> 
> $ time sudo ip -6 nexthop flush
> Dump was interrupted and may be inconsistent.
> Flushed 2097152 nexthops
> 
> real	1m45.345s
> user	0m0.001s
> sys	0m0.005s
> 
> $ time sudo ip -6 nexthop flush
> Dump was interrupted and may be inconsistent.
> Flushed 4194304 nexthops
> 
> real	3m10.430s
> user	0m0.002s
> sys	0m0.004s
> 
> AFTER:
> ======
> 
> $ time sudo ip -6 nexthop flush
> Dump was interrupted and may be inconsistent.
> Flushed 2097152 nexthops
> 
> real	0m17.545s
> user	0m0.003s
> sys	0m0.003s
> 
> $ time sudo ip -6 nexthop flush
> Dump was interrupted and may be inconsistent.
> Flushed 4194304 nexthops
> 
> real	0m35.823s
> user	0m0.002s
> sys	0m0.004s
> 
> Signed-off-by: Christoph Paasch <cpaasch@openai.com>
> ---
>  net/ipv4/nexthop.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
> index 509004bfd08ec43de44c7ce4a540c983d0e70201..0a20625f5ffb471052d92b48802076b8295dd703 100644
> --- a/net/ipv4/nexthop.c
> +++ b/net/ipv4/nexthop.c
> @@ -2087,6 +2087,12 @@ static void remove_nexthop_from_groups(struct net *net, struct nexthop *nh,
>  {
>  	struct nh_grp_entry *nhge, *tmp;
>  
> +	/* If there is nothing to do, let's avoid the costly call to
> +	 * synchronize_net()
> +	 */
> +	if (list_empty(&nh->grp_list))
> +		return;
> +
>  	list_for_each_entry_safe(nhge, tmp, &nh->grp_list, nh_list)
>  		remove_nh_grp_entry(net, nhge, nlinfo);
>  
> 

Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>



* Re: [PATCH net-next v2 1/2] net: Make nexthop-dumps scale linearly with the number of nexthops
  2025-08-16 23:12 ` [PATCH net-next v2 1/2] net: Make nexthop-dumps scale linearly with the number " Christoph Paasch via B4 Relay
  2025-08-17  8:42   ` Ido Schimmel
  2025-08-17  9:40   ` Nikolay Aleksandrov
@ 2025-08-18  9:54   ` Eric Dumazet
  2 siblings, 0 replies; 11+ messages in thread
From: Eric Dumazet @ 2025-08-18  9:54 UTC
  To: cpaasch
  Cc: David Ahern, Nikolay Aleksandrov, David S. Miller, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Ido Schimmel, netdev

On Sat, Aug 16, 2025 at 4:13 PM Christoph Paasch via B4 Relay
<devnull+cpaasch.openai.com@kernel.org> wrote:
>
> From: Christoph Paasch <cpaasch@openai.com>
>
> When we have a (very) large number of nexthops, they do not fit within a
> single message. rtm_dump_walk_nexthops() thus will be called repeatedly
> and ctx->idx is used to avoid dumping the same nexthops again.
>
> The approach in which we avoid dumping the same nexthops is by basically
> walking the entire nexthop rb-tree from the left-most node until we find
> a node whose id is >= s_idx. That does not scale well.
>
> Instead of this inefficient approach, rather go directly through the
> tree to the nexthop that should be dumped (the one whose nh_id >=
> s_idx). This allows us to find the relevant node in O(log(n)).
>
> We have quite a nice improvement with this:
>
> Before:
> =======
>
> --> ~1M nexthops:
> $ time ~/libnl/src/nl-nh-list | wc -l
> 1050624
>
> real    0m21.080s
> user    0m0.666s
> sys     0m20.384s
>
> --> ~2M nexthops:
> $ time ~/libnl/src/nl-nh-list | wc -l
> 2101248
>
> real    1m51.649s
> user    0m1.540s
> sys     1m49.908s
>
> After:
> ======
>
> --> ~1M nexthops:
> $ time ~/libnl/src/nl-nh-list | wc -l
> 1050624
>
> real    0m1.157s
> user    0m0.926s
> sys     0m0.259s
>
> --> ~2M nexthops:
> $ time ~/libnl/src/nl-nh-list | wc -l
> 2101248
>
> real    0m2.763s
> user    0m2.042s
> sys     0m0.776s
>
> Signed-off-by: Christoph Paasch <cpaasch@openai.com>
> ---

Reviewed-by: Eric Dumazet <edumazet@google.com>


* Re: [PATCH net-next v2 2/2] net: When removing nexthops, don't call synchronize_net if it is not necessary
  2025-08-16 23:12 ` [PATCH net-next v2 2/2] net: When removing nexthops, don't call synchronize_net if it is not necessary Christoph Paasch via B4 Relay
  2025-08-17  8:44   ` Ido Schimmel
  2025-08-17  9:40   ` Nikolay Aleksandrov
@ 2025-08-18 10:02   ` Eric Dumazet
  2 siblings, 0 replies; 11+ messages in thread
From: Eric Dumazet @ 2025-08-18 10:02 UTC
  To: cpaasch
  Cc: David Ahern, Nikolay Aleksandrov, David S. Miller, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Ido Schimmel, netdev

On Sat, Aug 16, 2025 at 4:13 PM Christoph Paasch via B4 Relay
<devnull+cpaasch.openai.com@kernel.org> wrote:
>
> From: Christoph Paasch <cpaasch@openai.com>
>
> When removing a nexthop, commit
> 90f33bffa382 ("nexthops: don't modify published nexthop groups") added a
> call to synchronize_rcu() (later changed to _net()) to make sure
> everyone sees the new nexthop-group before the rtnl-lock is released.
>
> When one wants to delete a large number of groups and nexthops, it is
> fastest to first flush the groups (ip nexthop flush groups) and then
> flush the nexthops themselves (ip -6 nexthop flush). As that way the
> groups don't need to be rebalanced.
>
> However, `ip -6 nexthop flush` will still take a long time if there is
> a very large number of nexthops because of the call to
> synchronize_net(). Now, if there are no more groups, there is no point
> in calling synchronize_net(). So, let's skip that entirely by checking
> if nh->grp_list is empty.
>

Reviewed-by: Eric Dumazet <edumazet@google.com>


* Re: [PATCH net-next v2 0/2] net: Speedup some nexthop handling when having A LOT of nexthops
  2025-08-16 23:12 [PATCH net-next v2 0/2] net: Speedup some nexthop handling when having A LOT of nexthops Christoph Paasch via B4 Relay
  2025-08-16 23:12 ` [PATCH net-next v2 1/2] net: Make nexthop-dumps scale linearly with the number " Christoph Paasch via B4 Relay
  2025-08-16 23:12 ` [PATCH net-next v2 2/2] net: When removing nexthops, don't call synchronize_net if it is not necessary Christoph Paasch via B4 Relay
@ 2025-08-18 13:33 ` David Ahern
  2025-08-20  1:00 ` patchwork-bot+netdevbpf
  3 siblings, 0 replies; 11+ messages in thread
From: David Ahern @ 2025-08-18 13:33 UTC
  To: cpaasch, Nikolay Aleksandrov, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, Ido Schimmel
  Cc: netdev

On 8/16/25 5:12 PM, Christoph Paasch via B4 Relay wrote:
> Configuring a very large number of nexthops is fairly possible within a
> reasonable time-frame. But, certain netlink commands can become
> extremely slow.
> 
> This series addresses some of these, namely dumping and removing
> nexthops.
> 
> Signed-off-by: Christoph Paasch <cpaasch@openai.com>
> ---
> Changes in v2:
> - Added another improvement to the series "net: When removing nexthops,
>   don't call synchronize_net if it is not necessary"
> - Fixed typos, made comments within 80-character limit and unified
>   comment-style. (Ido Schimmel)
> - Removed if (nh->id < s_idx) in the for-loop as it is no more needed.
>   (Ido Schimmel)
> - Link to v1: https://lore.kernel.org/r/20250724-nexthop_dump-v1-1-6b43fffd5bac@openai.com
> 
> ---
> Christoph Paasch (2):
>       net: Make nexthop-dumps scale linearly with the number of nexthops
>       net: When removing nexthops, don't call synchronize_net if it is not necessary
> 
>  net/ipv4/nexthop.c | 42 +++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 39 insertions(+), 3 deletions(-)
> ---
> base-commit: bab3ce404553de56242d7b09ad7ea5b70441ea41
> change-id: 20250724-nexthop_dump-f6c32472bcdf
> 
> Best regards,

For the set:
Reviewed-by: David Ahern <dsahern@kernel.org>


* Re: [PATCH net-next v2 0/2] net: Speedup some nexthop handling when having A LOT of nexthops
  2025-08-16 23:12 [PATCH net-next v2 0/2] net: Speedup some nexthop handling when having A LOT of nexthops Christoph Paasch via B4 Relay
                   ` (2 preceding siblings ...)
  2025-08-18 13:33 ` [PATCH net-next v2 0/2] net: Speedup some nexthop handling when having A LOT of nexthops David Ahern
@ 2025-08-20  1:00 ` patchwork-bot+netdevbpf
  3 siblings, 0 replies; 11+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-08-20  1:00 UTC
  To: Christoph Paasch
  Cc: dsahern, razor, davem, edumazet, kuba, pabeni, horms, idosch,
	netdev

Hello:

This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Sat, 16 Aug 2025 16:12:47 -0700 you wrote:
> Configuring a very large number of nexthops is fairly possible within a
> reasonable time-frame. But, certain netlink commands can become
> extremely slow.
> 
> This series addresses some of these, namely dumping and removing
> nexthops.
> 
> [...]

Here is the summary with links:
  - [net-next,v2,1/2] net: Make nexthop-dumps scale linearly with the number of nexthops
    https://git.kernel.org/netdev/net-next/c/5236f57e7c03
  - [net-next,v2,2/2] net: When removing nexthops, don't call synchronize_net if it is not necessary
    https://git.kernel.org/netdev/net-next/c/b0ac6d3b56a2

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


