netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 0/4] net: reduce RTNL pressure in unregister_netdevice()
@ 2025-01-07 17:38 Eric Dumazet
  2025-01-07 17:38 ` [PATCH net-next 1/4] net: no longer assume RTNL is held in flush_all_backlogs() Eric Dumazet
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Eric Dumazet @ 2025-01-07 17:38 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Simon Horman, eric.dumazet, Eric Dumazet

One major source of RTNL contention resides in unregister_netdevice()

Due to RCU protection of various network structures, and
unregister_netdevice() being a synchronous function,
it is calling potentially slow functions while holding RTNL.

I think we can release RTNL in two points, so that three
slow functions are called while RTNL can be used
by other threads.

Eric Dumazet (4):
  net: no longer assume RTNL is held in flush_all_backlogs()
  net: no longer hold RTNL while calling flush_all_backlogs()
  net: expedite synchronize_net() for cleanup_net()
  net: reduce RTNL hold duration in unregister_netdevice_many_notify()

 include/net/net_namespace.h |  2 ++
 net/core/dev.c              | 61 +++++++++++++++++++++++++------------
 net/core/net_namespace.c    |  5 +++
 3 files changed, 48 insertions(+), 20 deletions(-)

-- 
2.47.1.613.gc27f4b7a9f-goog


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH net-next 1/4] net: no longer assume RTNL is held in flush_all_backlogs()
  2025-01-07 17:38 [PATCH net-next 0/4] net: reduce RTNL pressure in unregister_netdevice() Eric Dumazet
@ 2025-01-07 17:38 ` Eric Dumazet
  2025-01-07 17:38 ` [PATCH net-next 2/4] net: no longer hold RTNL while calling flush_all_backlogs() Eric Dumazet
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Eric Dumazet @ 2025-01-07 17:38 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Simon Horman, eric.dumazet, Eric Dumazet

flush_all_backlogs() uses per-cpu data to hold its
temporary daya, on the assumption it is called under RTNL
protection.

Following patch in the series will break this assumption.

Use instead a dynamically allocated piece of memory.

In the unlikely case the allocation fails,
use a boot-time allocated memory.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/core/dev.c | 53 +++++++++++++++++++++++++++++++++-----------------
 1 file changed, 35 insertions(+), 18 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 073f682a9653a212198b12bae17fafe7b46f96e9..9e1eb272e4feaf40dc87defd54d691634e0902e5 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5961,8 +5961,6 @@ void netif_receive_skb_list(struct list_head *head)
 }
 EXPORT_SYMBOL(netif_receive_skb_list);
 
-static DEFINE_PER_CPU(struct work_struct, flush_works);
-
 /* Network device is going away, flush any packets still pending */
 static void flush_backlog(struct work_struct *work)
 {
@@ -6019,36 +6017,54 @@ static bool flush_required(int cpu)
 	return true;
 }
 
+struct flush_backlogs {
+	cpumask_t		flush_cpus;
+	struct work_struct	w[];
+};
+
+static struct flush_backlogs *flush_backlogs_alloc(void)
+{
+	return kmalloc(struct_size_t(struct flush_backlogs, w, nr_cpu_ids),
+		       GFP_KERNEL);
+}
+
+static struct flush_backlogs *flush_backlogs_fallback;
+static DEFINE_MUTEX(flush_backlogs_mutex);
+
 static void flush_all_backlogs(void)
 {
-	static cpumask_t flush_cpus;
+	struct flush_backlogs *ptr = flush_backlogs_alloc();
 	unsigned int cpu;
 
-	/* since we are under rtnl lock protection we can use static data
-	 * for the cpumask and avoid allocating on stack the possibly
-	 * large mask
-	 */
-	ASSERT_RTNL();
+	if (!ptr) {
+		mutex_lock(&flush_backlogs_mutex);
+		ptr = flush_backlogs_fallback;
+	}
+	cpumask_clear(&ptr->flush_cpus);
 
 	cpus_read_lock();
 
-	cpumask_clear(&flush_cpus);
 	for_each_online_cpu(cpu) {
 		if (flush_required(cpu)) {
-			queue_work_on(cpu, system_highpri_wq,
-				      per_cpu_ptr(&flush_works, cpu));
-			cpumask_set_cpu(cpu, &flush_cpus);
+			INIT_WORK(&ptr->w[cpu], flush_backlog);
+			queue_work_on(cpu, system_highpri_wq, &ptr->w[cpu]);
+			__cpumask_set_cpu(cpu, &ptr->flush_cpus);
 		}
 	}
 
 	/* we can have in flight packet[s] on the cpus we are not flushing,
 	 * synchronize_net() in unregister_netdevice_many() will take care of
-	 * them
+	 * them.
 	 */
-	for_each_cpu(cpu, &flush_cpus)
-		flush_work(per_cpu_ptr(&flush_works, cpu));
+	for_each_cpu(cpu, &ptr->flush_cpus)
+		flush_work(&ptr->w[cpu]);
 
 	cpus_read_unlock();
+
+	if (ptr != flush_backlogs_fallback)
+		kfree(ptr);
+	else
+		mutex_unlock(&flush_backlogs_mutex);
 }
 
 static void net_rps_send_ipi(struct softnet_data *remsd)
@@ -12237,12 +12253,13 @@ static int __init net_dev_init(void)
 	 *	Initialise the packet receive queues.
 	 */
 
+	flush_backlogs_fallback = flush_backlogs_alloc();
+	if (!flush_backlogs_fallback)
+		goto out;
+
 	for_each_possible_cpu(i) {
-		struct work_struct *flush = per_cpu_ptr(&flush_works, i);
 		struct softnet_data *sd = &per_cpu(softnet_data, i);
 
-		INIT_WORK(flush, flush_backlog);
-
 		skb_queue_head_init(&sd->input_pkt_queue);
 		skb_queue_head_init(&sd->process_queue);
 #ifdef CONFIG_XFRM_OFFLOAD
-- 
2.47.1.613.gc27f4b7a9f-goog


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net-next 2/4] net: no longer hold RTNL while calling flush_all_backlogs()
  2025-01-07 17:38 [PATCH net-next 0/4] net: reduce RTNL pressure in unregister_netdevice() Eric Dumazet
  2025-01-07 17:38 ` [PATCH net-next 1/4] net: no longer assume RTNL is held in flush_all_backlogs() Eric Dumazet
@ 2025-01-07 17:38 ` Eric Dumazet
  2025-01-07 17:38 ` [PATCH net-next 3/4] net: expedite synchronize_net() for cleanup_net() Eric Dumazet
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Eric Dumazet @ 2025-01-07 17:38 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Simon Horman, eric.dumazet, Eric Dumazet

flush_all_backlogs() is called from unregister_netdevice_many_notify()
as part of netdevice dismantles.

This is currently called under RTNL, and can last up to 20ms on busy hosts.

There is no reason to block RTNL at this stage.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/core/dev.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 9e1eb272e4feaf40dc87defd54d691634e0902e5..ef6426aad84dc00740a1716c8fd4cfd48ee17cf3 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -11526,8 +11526,10 @@ void unregister_netdevice_many_notify(struct list_head *head,
 		unlist_netdevice(dev);
 		WRITE_ONCE(dev->reg_state, NETREG_UNREGISTERING);
 	}
-	flush_all_backlogs();
 
+	__rtnl_unlock();
+	flush_all_backlogs();
+	rtnl_lock();
 	synchronize_net();
 
 	list_for_each_entry(dev, head, unreg_list) {
-- 
2.47.1.613.gc27f4b7a9f-goog


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net-next 3/4] net: expedite synchronize_net() for cleanup_net()
  2025-01-07 17:38 [PATCH net-next 0/4] net: reduce RTNL pressure in unregister_netdevice() Eric Dumazet
  2025-01-07 17:38 ` [PATCH net-next 1/4] net: no longer assume RTNL is held in flush_all_backlogs() Eric Dumazet
  2025-01-07 17:38 ` [PATCH net-next 2/4] net: no longer hold RTNL while calling flush_all_backlogs() Eric Dumazet
@ 2025-01-07 17:38 ` Eric Dumazet
  2025-01-09 17:42   ` kernel test robot
  2025-01-09 18:14   ` kernel test robot
  2025-01-07 17:38 ` [PATCH net-next 4/4] net: reduce RTNL hold duration in unregister_netdevice_many_notify() Eric Dumazet
  2025-01-07 20:11 ` [PATCH net-next 0/4] net: reduce RTNL pressure in unregister_netdevice() Jakub Kicinski
  4 siblings, 2 replies; 11+ messages in thread
From: Eric Dumazet @ 2025-01-07 17:38 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Simon Horman, eric.dumazet, Eric Dumazet

cleanup_net() is the single thread responsible
for netns dismantles, and a potential bottleneck.

Before we can get per-netns RTNL, make sure
all synchronize_net() called from this thread
are using rcu_synchronize_expedited().

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/net_namespace.h | 2 ++
 net/core/dev.c              | 2 +-
 net/core/net_namespace.c    | 5 +++++
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 5a2a0df8ad91b677b515b392869c6c755be5c868..a3009bdd7efec0a3b665cbf51c159c323458410a 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -565,4 +565,6 @@ void net_ns_init(void);
 static inline void net_ns_init(void) {}
 #endif
 
+extern struct task_struct *cleanup_net_task;
+
 #endif /* __NET_NET_NAMESPACE_H */
diff --git a/net/core/dev.c b/net/core/dev.c
index ef6426aad84dc00740a1716c8fd4cfd48ee17cf3..342ab7d6001da8983db450f50327fc7915b0a8ba 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -11423,7 +11423,7 @@ EXPORT_SYMBOL_GPL(alloc_netdev_dummy);
 void synchronize_net(void)
 {
 	might_sleep();
-	if (rtnl_is_locked())
+	if (current == cleanup_net_task || rtnl_is_locked())
 		synchronize_rcu_expedited();
 	else
 		synchronize_rcu();
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index b5cd3ae4f04cf28d43f8401a3dafebac4a297123..cb39a12b2f8295c605f08b5589932932150a1644 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -588,6 +588,8 @@ static void unhash_nsid(struct net *net, struct net *last)
 
 static LLIST_HEAD(cleanup_list);
 
+struct task_struct *cleanup_net_task;
+
 static void cleanup_net(struct work_struct *work)
 {
 	const struct pernet_operations *ops;
@@ -596,6 +598,8 @@ static void cleanup_net(struct work_struct *work)
 	LIST_HEAD(net_exit_list);
 	LIST_HEAD(dev_kill_list);
 
+	cleanup_net_task = current;
+
 	/* Atomically snapshot the list of namespaces to cleanup */
 	net_kill_list = llist_del_all(&cleanup_list);
 
@@ -670,6 +674,7 @@ static void cleanup_net(struct work_struct *work)
 		put_user_ns(net->user_ns);
 		net_free(net);
 	}
+	cleanup_net_task = NULL;
 }
 
 /**
-- 
2.47.1.613.gc27f4b7a9f-goog


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net-next 4/4] net: reduce RTNL hold duration in unregister_netdevice_many_notify()
  2025-01-07 17:38 [PATCH net-next 0/4] net: reduce RTNL pressure in unregister_netdevice() Eric Dumazet
                   ` (2 preceding siblings ...)
  2025-01-07 17:38 ` [PATCH net-next 3/4] net: expedite synchronize_net() for cleanup_net() Eric Dumazet
@ 2025-01-07 17:38 ` Eric Dumazet
  2025-01-07 20:11 ` [PATCH net-next 0/4] net: reduce RTNL pressure in unregister_netdevice() Jakub Kicinski
  4 siblings, 0 replies; 11+ messages in thread
From: Eric Dumazet @ 2025-01-07 17:38 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: netdev, Simon Horman, eric.dumazet, Eric Dumazet

Two synchronize_net() calls are currently done while holding RTNL.

This is source of RTNL contention in workloads adding and deleting
many network namespaces per second, because synchronize_rcu()
and synchronize_rcu_expedited() can use 10+ ms in some cases.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/core/dev.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 342ab7d6001da8983db450f50327fc7915b0a8ba..9e93b13b9a76bd256d93d05a13d21dca883d6ab8 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -11529,8 +11529,8 @@ void unregister_netdevice_many_notify(struct list_head *head,
 
 	__rtnl_unlock();
 	flush_all_backlogs();
-	rtnl_lock();
 	synchronize_net();
+	rtnl_lock();
 
 	list_for_each_entry(dev, head, unreg_list) {
 		struct sk_buff *skb = NULL;
@@ -11590,7 +11590,9 @@ void unregister_netdevice_many_notify(struct list_head *head,
 #endif
 	}
 
+	__rtnl_unlock();
 	synchronize_net();
+	rtnl_lock();
 
 	list_for_each_entry(dev, head, unreg_list) {
 		netdev_put(dev, &dev->dev_registered_tracker);
-- 
2.47.1.613.gc27f4b7a9f-goog


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next 0/4] net: reduce RTNL pressure in unregister_netdevice()
  2025-01-07 17:38 [PATCH net-next 0/4] net: reduce RTNL pressure in unregister_netdevice() Eric Dumazet
                   ` (3 preceding siblings ...)
  2025-01-07 17:38 ` [PATCH net-next 4/4] net: reduce RTNL hold duration in unregister_netdevice_many_notify() Eric Dumazet
@ 2025-01-07 20:11 ` Jakub Kicinski
  2025-01-07 20:22   ` Eric Dumazet
  4 siblings, 1 reply; 11+ messages in thread
From: Jakub Kicinski @ 2025-01-07 20:11 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Paolo Abeni, netdev, Simon Horman, eric.dumazet

On Tue,  7 Jan 2025 17:38:34 +0000 Eric Dumazet wrote:
> One major source of RTNL contention resides in unregister_netdevice()
> 
> Due to RCU protection of various network structures, and
> unregister_netdevice() being a synchronous function,
> it is calling potentially slow functions while holding RTNL.
> 
> I think we can release RTNL in two points, so that three
> slow functions are called while RTNL can be used
> by other threads.

I think we'll need:

diff --git a/net/devlink/port.c b/net/devlink/port.c
index 939081a0e615..cdfa22453a55 100644
--- a/net/devlink/port.c
+++ b/net/devlink/port.c
@@ -1311,6 +1311,7 @@ int devlink_port_netdevice_event(struct notifier_block *nb,
 		__devlink_port_type_set(devlink_port, devlink_port->type,
 					netdev);
 		break;
+	case NETDEV_UNREGISTERING:
 	case NETDEV_UNREGISTER:
 		if (devlink_net(devlink) != dev_net(netdev))
 			return NOTIFY_OK;


There is no other way to speed things up? Use RT prio for the work?
Maybe WRITE_ONCE() a special handler into backlog.poll, and schedule it?

I'm not gonna stand in your way but in general re-taking caller locks
in a callee is a bit ugly :(

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next 0/4] net: reduce RTNL pressure in unregister_netdevice()
  2025-01-07 20:11 ` [PATCH net-next 0/4] net: reduce RTNL pressure in unregister_netdevice() Jakub Kicinski
@ 2025-01-07 20:22   ` Eric Dumazet
  2025-01-07 20:46     ` Eric Dumazet
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2025-01-07 20:22 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David S . Miller, Paolo Abeni, netdev, Simon Horman, eric.dumazet

On Tue, Jan 7, 2025 at 9:11 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Tue,  7 Jan 2025 17:38:34 +0000 Eric Dumazet wrote:
> > One major source of RTNL contention resides in unregister_netdevice()
> >
> > Due to RCU protection of various network structures, and
> > unregister_netdevice() being a synchronous function,
> > it is calling potentially slow functions while holding RTNL.
> >
> > I think we can release RTNL in two points, so that three
> > slow functions are called while RTNL can be used
> > by other threads.
>
> I think we'll need:
>
> diff --git a/net/devlink/port.c b/net/devlink/port.c
> index 939081a0e615..cdfa22453a55 100644
> --- a/net/devlink/port.c
> +++ b/net/devlink/port.c
> @@ -1311,6 +1311,7 @@ int devlink_port_netdevice_event(struct notifier_block *nb,
>                 __devlink_port_type_set(devlink_port, devlink_port->type,
>                                         netdev);
>                 break;
> +       case NETDEV_UNREGISTERING:

Not sure I follow ?

>         case NETDEV_UNREGISTER:
>                 if (devlink_net(devlink) != dev_net(netdev))
>                         return NOTIFY_OK;
>
>
> There is no other way to speed things up? Use RT prio for the work?
> Maybe WRITE_ONCE() a special handler into backlog.poll, and schedule it?
>
> I'm not gonna stand in your way but in general re-taking caller locks
> in a callee is a bit ugly :(

We might restrict this stuff to cleanup_net() caller only, we know the
netns are disappearing
and that no other thread can mess with them.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next 0/4] net: reduce RTNL pressure in unregister_netdevice()
  2025-01-07 20:22   ` Eric Dumazet
@ 2025-01-07 20:46     ` Eric Dumazet
  2025-01-07 21:09       ` Jakub Kicinski
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2025-01-07 20:46 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David S . Miller, Paolo Abeni, netdev, Simon Horman, eric.dumazet

On Tue, Jan 7, 2025 at 9:22 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Tue, Jan 7, 2025 at 9:11 PM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > On Tue,  7 Jan 2025 17:38:34 +0000 Eric Dumazet wrote:
> > > One major source of RTNL contention resides in unregister_netdevice()
> > >
> > > Due to RCU protection of various network structures, and
> > > unregister_netdevice() being a synchronous function,
> > > it is calling potentially slow functions while holding RTNL.
> > >
> > > I think we can release RTNL in two points, so that three
> > > slow functions are called while RTNL can be used
> > > by other threads.
> >
> > I think we'll need:
> >
> > diff --git a/net/devlink/port.c b/net/devlink/port.c
> > index 939081a0e615..cdfa22453a55 100644
> > --- a/net/devlink/port.c
> > +++ b/net/devlink/port.c
> > @@ -1311,6 +1311,7 @@ int devlink_port_netdevice_event(struct notifier_block *nb,
> >                 __devlink_port_type_set(devlink_port, devlink_port->type,
> >                                         netdev);
> >                 break;
> > +       case NETDEV_UNREGISTERING:
>
> Not sure I follow ?
>
> >         case NETDEV_UNREGISTER:
> >                 if (devlink_net(devlink) != dev_net(netdev))
> >                         return NOTIFY_OK;
> >
> >
> > There is no other way to speed things up? Use RT prio for the work?
> > Maybe WRITE_ONCE() a special handler into backlog.poll, and schedule it?
> >
> > I'm not gonna stand in your way but in general re-taking caller locks
> > in a callee is a bit ugly :(
>
> We might restrict this stuff to cleanup_net() caller only, we know the
> netns are disappearing
> and that no other thread can mess with them.

ie something like:

diff --git a/net/core/dev.c b/net/core/dev.c
index 9e93b13b9a76bd256d93d05a13d21dca883d6ab8..a555e82adbeda90672c72700e9235a5d271be8fd
100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -11414,6 +11414,23 @@ struct net_device *alloc_netdev_dummy(int sizeof_priv)
 }
 EXPORT_SYMBOL_GPL(alloc_netdev_dummy);

+static bool from_cleanup_net(void)
+{
+       return current == cleanup_net_task;
+}
+
+static void rtnl_drop_if_cleanup(void)
+{
+       if (from_cleanup_net())
+               __rtnl_unlock();
+}
+
+static void rtnl_acquire_if_cleanup(void)
+{
+       if (from_cleanup_net())
+               rtnl_lock();
+}
+
 /**
  *     synchronize_net -  Synchronize with packet receive processing
  *
@@ -11423,7 +11440,7 @@ EXPORT_SYMBOL_GPL(alloc_netdev_dummy);
 void synchronize_net(void)
 {
        might_sleep();
-       if (current == cleanup_net_task || rtnl_is_locked())
+       if (from_cleanup_net() || rtnl_is_locked())
                synchronize_rcu_expedited();
        else
                synchronize_rcu();
@@ -11527,10 +11544,10 @@ void unregister_netdevice_many_notify(struct
list_head *head,
                WRITE_ONCE(dev->reg_state, NETREG_UNREGISTERING);
        }

-       __rtnl_unlock();
+       rtnl_drop_if_cleanup();
        flush_all_backlogs();
        synchronize_net();
-       rtnl_lock();
+       rnl_acquire_if_cleanup();

        list_for_each_entry(dev, head, unreg_list) {
                struct sk_buff *skb = NULL;
@@ -11590,9 +11607,9 @@ void unregister_netdevice_many_notify(struct
list_head *head,
 #endif
        }

-       __rtnl_unlock();
+       rtnl_drop_if_cleanup();
        synchronize_net();
-       rtnl_lock();
+       rnl_acquire_if_cleanup();

        list_for_each_entry(dev, head, unreg_list) {
                netdev_put(dev, &dev->dev_registered_tracker);

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next 0/4] net: reduce RTNL pressure in unregister_netdevice()
  2025-01-07 20:46     ` Eric Dumazet
@ 2025-01-07 21:09       ` Jakub Kicinski
  0 siblings, 0 replies; 11+ messages in thread
From: Jakub Kicinski @ 2025-01-07 21:09 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Paolo Abeni, netdev, Simon Horman, eric.dumazet

On Tue, 7 Jan 2025 21:46:41 +0100 Eric Dumazet wrote:
> > > I think we'll need:
> > >
> > > diff --git a/net/devlink/port.c b/net/devlink/port.c
> > > index 939081a0e615..cdfa22453a55 100644
> > > --- a/net/devlink/port.c
> > > +++ b/net/devlink/port.c
> > > @@ -1311,6 +1311,7 @@ int devlink_port_netdevice_event(struct notifier_block *nb,
> > >                 __devlink_port_type_set(devlink_port, devlink_port->type,
> > >                                         netdev);
> > >                 break;
> > > +       case NETDEV_UNREGISTERING:  
> >
> > Not sure I follow ?

I was worried some code assumed devlink_port->netdev is safe to access
under rtnl_lock. But looking closer it's only used in trivial ways, so
you can ignore that.

> > >         case NETDEV_UNREGISTER:
> > >                 if (devlink_net(devlink) != dev_net(netdev))
> > >                         return NOTIFY_OK;
> > >
> > >
> > > There is no other way to speed things up? Use RT prio for the work?
> > > Maybe WRITE_ONCE() a special handler into backlog.poll, and schedule it?
> > >
> > > I'm not gonna stand in your way but in general re-taking caller locks
> > > in a callee is a bit ugly :(  
> >
> > We might restrict this stuff to cleanup_net() caller only, we know the
> > netns are disappearing and that no other thread can mess with them.  

Unless the interface has a peer in another netns. But that should be
fine, interface will be de-listed. I'm slightly more concerned that some
random code in the kernel assumes its stashed netdev pointer to be valid
under rtnl_lock, as long as it has not seen an NETDEV_UNREGISTER event.
I guess we'll find out..? :)

> ie something like:
> [...]

LGTM!

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next 3/4] net: expedite synchronize_net() for cleanup_net()
  2025-01-07 17:38 ` [PATCH net-next 3/4] net: expedite synchronize_net() for cleanup_net() Eric Dumazet
@ 2025-01-09 17:42   ` kernel test robot
  2025-01-09 18:14   ` kernel test robot
  1 sibling, 0 replies; 11+ messages in thread
From: kernel test robot @ 2025-01-09 17:42 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: oe-kbuild-all, netdev, Simon Horman, eric.dumazet, Eric Dumazet

Hi Eric,

kernel test robot noticed the following build errors:

[auto build test ERROR on net-next/main]

url:    https://github.com/intel-lab-lkp/linux/commits/Eric-Dumazet/net-no-longer-assume-RTNL-is-held-in-flush_all_backlogs/20250108-014049
base:   net-next/main
patch link:    https://lore.kernel.org/r/20250107173838.1130187-4-edumazet%40google.com
patch subject: [PATCH net-next 3/4] net: expedite synchronize_net() for cleanup_net()
config: mips-bmips_be_defconfig (https://download.01.org/0day-ci/archive/20250110/202501100127.LhRwhYMs-lkp@intel.com/config)
compiler: mips-linux-gcc (GCC) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250110/202501100127.LhRwhYMs-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202501100127.LhRwhYMs-lkp@intel.com/

All errors (new ones prefixed by >>):

   mips-linux-ld: net/core/dev.o: in function `synchronize_net':
   dev.c:(.text+0x2250): undefined reference to `cleanup_net_task'
>> mips-linux-ld: dev.c:(.text+0x225c): undefined reference to `cleanup_net_task'
   mips-linux-ld: net/core/dev.o: in function `free_netdev':
   dev.c:(.text+0x5824): undefined reference to `cleanup_net_task'
   mips-linux-ld: net/core/dev.o: in function `netdev_rx_handler_unregister':
   dev.c:(.text+0x6204): undefined reference to `cleanup_net_task'
   mips-linux-ld: dev.c:(.text+0x620c): undefined reference to `cleanup_net_task'
   mips-linux-ld: net/core/dev.o:dev.c:(.text+0x6764): more undefined references to `cleanup_net_task' follow

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next 3/4] net: expedite synchronize_net() for cleanup_net()
  2025-01-07 17:38 ` [PATCH net-next 3/4] net: expedite synchronize_net() for cleanup_net() Eric Dumazet
  2025-01-09 17:42   ` kernel test robot
@ 2025-01-09 18:14   ` kernel test robot
  1 sibling, 0 replies; 11+ messages in thread
From: kernel test robot @ 2025-01-09 18:14 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: oe-kbuild-all, netdev, Simon Horman, eric.dumazet, Eric Dumazet

Hi Eric,

kernel test robot noticed the following build errors:

[auto build test ERROR on net-next/main]

url:    https://github.com/intel-lab-lkp/linux/commits/Eric-Dumazet/net-no-longer-assume-RTNL-is-held-in-flush_all_backlogs/20250108-014049
base:   net-next/main
patch link:    https://lore.kernel.org/r/20250107173838.1130187-4-edumazet%40google.com
patch subject: [PATCH net-next 3/4] net: expedite synchronize_net() for cleanup_net()
config: powerpc-fsp2_defconfig (https://download.01.org/0day-ci/archive/20250110/202501100222.pLf6muKs-lkp@intel.com/config)
compiler: powerpc-linux-gcc (GCC) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250110/202501100222.pLf6muKs-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202501100222.pLf6muKs-lkp@intel.com/

All errors (new ones prefixed by >>):

   powerpc-linux-ld: net/core/dev.o: in function `synchronize_net':
   net/core/dev.c:11426:(.text+0x1be2): undefined reference to `cleanup_net_task'
>> powerpc-linux-ld: net/core/dev.c:11426:(.text+0x1be6): undefined reference to `cleanup_net_task'
   powerpc-linux-ld: net/core/dev.c:11426:(.text+0x68ce): undefined reference to `cleanup_net_task'
   powerpc-linux-ld: net/core/dev.c:11426:(.text+0x68d2): undefined reference to `cleanup_net_task'
   powerpc-linux-ld: net/core/dev.o: in function `free_netdev':
   net/core/dev.c:11372:(.text+0x6ae2): undefined reference to `cleanup_net_task'
   powerpc-linux-ld: net/core/dev.o:net/core/dev.c:11372: more undefined references to `cleanup_net_task' follow

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2025-01-09 18:15 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-07 17:38 [PATCH net-next 0/4] net: reduce RTNL pressure in unregister_netdevice() Eric Dumazet
2025-01-07 17:38 ` [PATCH net-next 1/4] net: no longer assume RTNL is held in flush_all_backlogs() Eric Dumazet
2025-01-07 17:38 ` [PATCH net-next 2/4] net: no longer hold RTNL while calling flush_all_backlogs() Eric Dumazet
2025-01-07 17:38 ` [PATCH net-next 3/4] net: expedite synchronize_net() for cleanup_net() Eric Dumazet
2025-01-09 17:42   ` kernel test robot
2025-01-09 18:14   ` kernel test robot
2025-01-07 17:38 ` [PATCH net-next 4/4] net: reduce RTNL hold duration in unregister_netdevice_many_notify() Eric Dumazet
2025-01-07 20:11 ` [PATCH net-next 0/4] net: reduce RTNL pressure in unregister_netdevice() Jakub Kicinski
2025-01-07 20:22   ` Eric Dumazet
2025-01-07 20:46     ` Eric Dumazet
2025-01-07 21:09       ` Jakub Kicinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).